Lexical generator#116

Open
TeddyRoncin wants to merge 20 commits into dev from feat/lexical-generator

Conversation

@TeddyRoncin
Member

Original PR #97

Generates HTML (for email embedding, cf. daymail) from Lexical rich content.
It should support every implemented node and render the HTML as it is displayed in the frontend rich text editor.

Emails support fewer features than the web, so the rendered result won't exactly match the original web presentation (e.g. the font will differ, since we cannot safely import a font).

This branch is based on feat/asso/edit


Missing features:

  • CheckList handling (styling, adding an element to replace the CSS ::before)
  • Tweak nested lists (replace a CSS selector to hide the marker)
  • Remove paragraph margins so the rendering matches the frontend
  • Remove list margins
  • Keep the default centering of table headers
  • Find a way to display images correctly when they are too large (remove the width/height properties and put them in the style?)
  • Fix the validation of the 'image' type
  • Update the test text so that it is in the format validated by the API

Missing tooling:

  • Documentation
  • Unit tests: impossible for now (jest does not allow the import of happy-dom performed by lexical/headless)

Comment on lines +129 to +131
return html
.replaceAll('class=""', '')
.replaceAll(/(?<=<[^>]+)(?<!\w|")\s+(?=[^>]*>)|(?<=<[^>]*(?:\w|"))\s+(?=>)/g, '');

Check failure

Code scanning / CodeQL

Incomplete multi-character sanitization High

This string may still contain <script, which may cause an HTML element injection vulnerability.

Copilot Autofix

AI 13 days ago

In general, the fix is to ensure that the final html string returned by generateHTML is run through a robust HTML sanitizer rather than only performing manual regex replacements. This addresses the core issue CodeQL reports: the returned string may still contain <script> or other executable or dangerous markup. Using a tried-and-tested sanitizer also avoids fragile, multi-character regex manipulations that can miss edge cases.

The best targeted fix here is: keep the existing formatting cleanups (removing empty class and extraneous whitespace) but then pass the result through a sanitization function from a reputable library such as sanitize-html. That library will remove or neutralize <script> tags, event-handler attributes, and other dangerous constructs. Concretely, within generateHTML in src/lexical/lexical.module.ts, after the .replaceAll(...) chain, add a call to sanitizeHtml and return its result. Also add an import for sanitize-html at the top of the file. This preserves the existing behavior (HTML structure, styles, etc.) as much as possible while adding security.

Specifically:

  • At the imports section of src/lexical/lexical.module.ts, add import sanitizeHtml from 'sanitize-html';.

  • Change the return expression from:

    return html
      .replaceAll('class=""', '')
      .replaceAll(/(?<=<[^>]+)(?<!\w|")\s+(?=[^>]*>)|(?<=<[^>]*(?:\w|"))\s+(?=>)/g, '');

    to:

    const cleanedHtml = html
      .replaceAll('class=""', '')
      .replaceAll(/(?<=<[^>]+)(?<!\w|")\s+(?=[^>]*>)|(?<=<[^>]*(?:\w|"))\s+(?=>)/g, '');
    
    return sanitizeHtml(cleanedHtml);

No new methods are required beyond using sanitizeHtml; the rest of the code remains the same.


Suggested changeset 2
src/lexical/lexical.module.ts

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/src/lexical/lexical.module.ts b/src/lexical/lexical.module.ts
--- a/src/lexical/lexical.module.ts
+++ b/src/lexical/lexical.module.ts
@@ -10,6 +10,7 @@
 import { HorizontalRuleNode } from '@lexical/extension';
 import { ColorTextNode, ImageNode, RegisteredStyleMap } from './nodes';
 import { patchNodeExportDOM } from './nodes/NodeStyleInjector';
+import sanitizeHtml from 'sanitize-html';
 
 /** @internal */
 export const BUNDLES = {
@@ -126,8 +127,10 @@
       editor.setEditorState(editor.parseEditorState(parsed));
       editor.read(() => (html = $generateHtmlFromNodes(editor)));
     });
-    return html
+    const cleanedHtml = html
       .replaceAll('class=""', '')
       .replaceAll(/(?<=<[^>]+)(?<!\w|")\s+(?=[^>]*>)|(?<=<[^>]*(?:\w|"))\s+(?=>)/g, '');
+
+    return sanitizeHtml(cleanedHtml);
   }
 }
EOF
package.json
Outside changed files

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/package.json b/package.json
--- a/package.json
+++ b/package.json
@@ -73,7 +73,8 @@
     "prisma": "^6.5.0",
     "reflect-metadata": "^0.2.2",
     "rxjs": "^7.8.2",
-    "sharp": "^0.33.5"
+    "sharp": "^0.33.5",
+    "sanitize-html": "^2.17.0"
   },
   "devDependencies": {
     "@faker-js/faker": "^9.6.0",
EOF
This fix introduces these dependencies
Package Version Security advisories
sanitize-html (npm) 2.17.0 None
Copilot is powered by AI and may make mistakes. Always verify output.
Member Author

Also, this is missing at least a short comment saying what it does.

Member Author

Oh right, I just figured out what this does: it basically strips the useless whitespace from your HTML. Example (with spaces replaced by • so that they are visible):
<balise••param=•"value•"•></•balise•>
becomes:
<balise•param="value•"></balise>
But:

  1. It's not perfect. For example, <•balise•param="value••a">••</balise> yields <•balise•param="value•a">••</balise>, whereas you would expect <balise•param="value••a"></balise>
  2. Won't the HTML generated by lexical already be minified anyway?
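
The two cases described in this comment can be reproduced with a short sketch. It reuses the regular expression from the diff as-is. The helper name `stripTagWhitespace` is only illustrative (it is not in the PR), and `replace` with the global flag stands in for `replaceAll`, which gives the same result here.

```typescript
// Hypothetical helper wrapping the cleanup chain from the diff (name not from the PR).
function stripTagWhitespace(html: string): string {
  return html
    // Drop empty class attributes.
    .replace(/class=""/g, '')
    // Inside a tag: remove whitespace that does not follow a word character or a quote,
    // and whitespace that sits right before the closing ">".
    .replace(/(?<=<[^>]+)(?<!\w|")\s+(?=[^>]*>)|(?<=<[^>]*(?:\w|"))\s+(?=>)/g, '');
}

// First example from the comment: redundant spaces inside the tags are removed.
const cleaned = stripTagWhitespace('<balise  param= "value " ></ balise >');

// Second example: the space right after "<" survives, and the double space
// inside the attribute value is collapsed even though it is content.
const imperfect = stripTagWhitespace('< balise param="value  a">  </balise>');

console.log(cleaned);
console.log(imperfect);
```

The second case is the one to watch: the expression cannot tell attribute values apart from tag syntax, so whitespace inside a quoted value can be altered.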

@codecov

codecov bot commented Feb 12, 2026

Codecov Report

❌ Patch coverage is 72.72727% with 33 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.41%. Comparing base (234921a) to head (0277418).

Files with missing lines                 Patch %   Lines
src/lexical/nodes/NodeStyleInjector.ts   55.55%    11 Missing and 5 partials ⚠️
src/lexical/nodes/ImageNode.ts           56.52%    10 Missing ⚠️
src/lexical/nodes/ColorTextNode.ts       66.66%    5 Missing and 1 partial ⚠️
src/lexical/lexical.module.ts            96.77%    1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##              dev     #116      +/-   ##
==========================================
- Coverage   85.19%   81.41%   -3.79%     
==========================================
  Files         157      126      -31     
  Lines        2824     2540     -284     
  Branches      523      407     -116     
==========================================
- Hits         2406     2068     -338     
- Misses        382      386       +4     
- Partials       36       86      +50     


"^.+\\.(t|j)s$": "ts-jest"
},
- "collectCoverageFrom": ["**/*.ts", "!main.ts"]
+ "collectCoverageFrom": ["**/*.ts", "!main.ts", "!**/*-res.dto.ts"],
Member Author

Uh, is there any point to this?

beforeAll(async () => {
app = await Test.createTestingModule({ imports: [AppModule] }).compile();
});
TimetableServiceUnitSpec(() => app);
Member Author

Add .skip here

func(app);
});
}
suite.skip =
Member Author

The skip should be on the function returned by suite(), so that you write suite().skip() instead of suite.skip()()
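
A minimal sketch of the shape suggested here, assuming the real helper wraps jest's `describe` and `describe.skip`. The PR's actual `suite` signature is not shown on this page, so the parameters below, the `Spec` type and the fake runners are all hypothetical and exist only to keep the example self-contained.

```typescript
// Stand-in for jest's describe / describe.skip (assumption for this sketch).
type Describe = (name: string, body: () => void) => void;

// A spec is callable and also carries its own `skip` variant.
type Spec<T> = ((app: () => T) => void) & { skip: (app: () => T) => void };

// Hypothetical suite factory: `skip` lives on the returned function, not on `suite`.
function suite<T>(
  name: string,
  func: (app: () => T) => void,
  describe: Describe,
  describeSkip: Describe,
): Spec<T> {
  const run = (app: () => T): void => describe(name, () => func(app));
  const skip = (app: () => T): void => describeSkip(name, () => func(app));
  return Object.assign(run, { skip });
}

// Fake runners that only record what was requested.
const log: string[] = [];
const fakeDescribe: Describe = (name, body) => {
  log.push(`run ${name}`);
  body();
};
const fakeDescribeSkip: Describe = (name) => {
  log.push(`skip ${name}`);
};

const ExampleSpec = suite<string>(
  'Example',
  (app) => {
    log.push(`body ${app()}`);
  },
  fakeDescribe,
  fakeDescribeSkip,
);

ExampleSpec(() => 'app'); // runs the suite body
ExampleSpec.skip(() => 'app'); // registers the suite as skipped, body not executed
```

With this shape, a call site such as `TimetableServiceUnitSpec(() => app)` only needs `.skip` inserted before the call to be disabled.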

### Node

Une node est une partie du texte ou un élément visuel que l'utilisateur peut ajouter à son texte. Il peut s'agir par exemple d'images, d'émojis customs, ou de texte avec un fond coloré.
Pour créer de nouvelles capacités/de nouveaux éléments dans le texte formatté, la première étape est de créer une `class` qui hérite de `ElementNode`, `TextNode` ou `DecoratorNode`. Pour voir les méthodes à implémenter, regarde [la doc de lexical](https://lexical.dev/docs/concepts/nodes#creating-custom-nodes). Comme indiqué plus bas dans la doc, tu peux utiliser [`$config` et ne pas implémenter les 3 fonctions statiques](https://lexical.dev/docs/concepts/nodes#extending-elementnode-with-config).
Member Author

Oh no, please don't use the informal "tu", help. Use "regarder" rather than "regarde", "vous pouvez" rather than "tu peux", etc.


project = 'EtuUTT'
- copyright = '2024, UTT Net Group'
+ copyright = '2025, UTT Net Group'
Member Author

2026 x)

updatedElement
.querySelectorAll('[class]')
.forEach((element: HTMLElement) =>
applyStylesToElement(element, Array.prototype.indexOf.call(element.parentElement.children, element)),
Member Author

Why not element.parentElement.children.indexOf? And if it's a typing issue, (element.parentElement.children as HTMLElement).indexOf(element)
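
One caveat on this suggestion: `children` is an `HTMLCollection`, which is array-like but has no `indexOf` method, which is probably why the diff goes through `Array.prototype.indexOf.call`. Below is a minimal sketch of the two usual options, using a plain array-like object as a stand-in for the DOM collection. The helper names are illustrative, not from the PR.

```typescript
// Borrow Array.prototype.indexOf without copying the collection (the approach in the diff).
function indexViaCall<T>(children: ArrayLike<T>, child: T): number {
  return Array.prototype.indexOf.call(children, child);
}

// Copy into a real array first; arguably easier to read.
function indexViaCopy<T>(children: ArrayLike<T>, child: T): number {
  return Array.from(children).indexOf(child);
}

// Stand-in for element.parentElement.children: indexed entries plus a length, no array methods.
const fakeChildren: ArrayLike<string> = { 0: 'header', 1: 'row', 2: 'footer', length: 3 };

console.log(indexViaCall(fakeChildren, 'row'));
console.log(indexViaCopy(fakeChildren, 'missing'));
```

Either form addresses the readability concern; a cast alone would not add the missing method at runtime.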

try {
const editor = createHeadlessEditor({
nodes: BUNDLES[bundle],
onError: () => {},
Member Author

Haven't looked closely, but doesn't this risk letting the error slip through, when you would want to catch it?
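
The concern can be illustrated without lexical. The sketch below assumes an editor that reports internal failures through its `onError` callback instead of throwing. Whether the headless editor behaves exactly this way has not been verified here, and `updateWithHandler` and `generate` are hypothetical stand-ins.

```typescript
type ErrorHandler = (error: Error) => void;

// Hypothetical stand-in for an editor that routes failures to a callback
// rather than throwing them (assumption made for this sketch only).
function updateWithHandler(task: () => void, onError: ErrorHandler): void {
  try {
    task();
  } catch (e) {
    onError(e instanceof Error ? e : new Error(String(e)));
  }
}

// Mirrors the try/catch around the editor in the diff.
function generate(onError: ErrorHandler): string {
  try {
    updateWithHandler(() => {
      throw new Error('invalid editor state');
    }, onError);
    return 'no error seen';
  } catch (e) {
    return `caught: ${(e as Error).message}`;
  }
}

// An empty handler swallows the failure: the surrounding catch never runs.
const silent = generate(() => {});

// Rethrowing from the handler lets the surrounding catch see it.
const rethrown = generate((error) => {
  throw error;
});

console.log(silent);
console.log(rethrown);
```

Under that assumption, replacing the empty handler with one that rethrows would make the surrounding `try` block effective.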

: keyof typeof CustomStyles;
}
: keyof typeof CustomStyles;
};
Member Author

Doing it recursively would be cleaner, I'd guess (maybe less performant though? And even then, these don't look like heavy computations)


2 participants