Conversation
mkdocs_drawio/plugin.py
Outdated
| xml_data = unquote(img.info.get("mxfile")) | ||
| diagram_xml = etree.fromstring(xml_data.encode()) |
There was a problem hiding this comment.
What happens when it is just a png image without the mxfile information? I think it would break and would not display anything, correct?
There was a problem hiding this comment.
If it's missing any mxfile data, it'll fail to retrieve the metadata and therefore won't be a valid diagram, causing it to fall back to the existing error handling. I've updated the example to include this case:
I've also updated mkdocs-classic, as I'd missed that and noticed it when I was pulling in the latest changes.
There was a problem hiding this comment.
I think this is undesirable as it could easily happen that users have named files diagram.drawio.png without it containing the mxdata in the png metadata and this PR would hence be a breaking change for those users of this plugin. Can we introduce a special error handling for png files to force rendering/treating them as normal images if the parsing fails?
There was a problem hiding this comment.
Good point, I hadn't considered that people may do that without intentionally saving it as an editable PNG. In this case, it's a valid flow, so I've shifted things around a little and introduced a warning for when the mxfile data is missing on a .drawio.png file.
The flow now is:
- Attempt to extract mxfile xml from file using new retrieve_mxfile method:
- If PNG with no mxfile data, log a warning and return None
- Else, parse the mxfile data into XML and return it
- An exception will be thrown if it fails to read or parse the file
- Handle result:
- If None, it's a PNG so simply skip this diagram to allow it to render normally
- If an exception was thrown, log an error and replace the diagram with "Not a diagram file" message (as before)
- If it successfully retrieved the XML, continue
- Call substitute_with_file method as before, but this time passing in the XML instead of the path and src:
- If successful, return the substitution
- If an exception was thrown, as in step 2, log an error and replace the diagram with "Not a diagram file" message (as before)
Does the flow make sense? Or did you have something else in mind? I have little experience with Python, so happy to tweak things if it can be tidied up further.
pyproject.toml
Outdated
| python = ">=3.8.0,<4.0" | ||
| requests = ">=2.0" | ||
| Jinja2 = ">=3.0" | ||
| Pillow = "<=12.1.1" |
There was a problem hiding this comment.
Pillow is quite excessive for just retrieving a bit of metadata from a PNG. It introduces a significant amount of sub dependencies and C code compilation which easily fails or is just unnecessary overhead in terms of compute and storage. Therefore I suggest we drop pillow in favour of pypng
There was a problem hiding this comment.
As per PNG Spec RFC2083
https://datatracker.ietf.org/doc/html/rfc2083#page-24
This should work for extracting the mxfile contents without the overhead of pillow
import png
r = png.Reader(filename="file.png")
for chunk in r.chunks():
if chunk[0] == b"tEXt":
key, value = chunk[1].split(b"\x00", 1)
if key == b"mxfile":
mxfile = value.decode("latin-1")There was a problem hiding this comment.
This is much better, thanks for your effort =)
mkdocs_drawio/plugin.py
Outdated
|
|
||
| xml_data = unquote(mxfile_metadata) | ||
| return etree.fromstring(xml_data.encode()) | ||
| else: |
There was a problem hiding this comment.
this else is redundant just put the return here
There was a problem hiding this comment.
Well that's quite a blatant case, gave me a good laugh at myself at least! Good spot, and apologies, I must have overlooked it in my moving things around and working out the best flow.
mkdocs_drawio/plugin.py
Outdated
|
|
||
| xml_data = unquote(mxfile_metadata) | ||
| return etree.fromstring(xml_data.encode()) | ||
| else: |
There was a problem hiding this comment.
this else is redundant just put the return here to reduce nesting
There was a problem hiding this comment.
thinking about it move the return to the verify top and invert the if statement
if not src.endswith(".png"):
etree.parse(path.joinpath(src).resolve())this way we adhere to early return principles and you can get rid of the indentation for the complete png extraction logic
There was a problem hiding this comment.
furthermore the regex ignores the casing this statement is case sensitive so convert to lower casing first and then check the contains
There was a problem hiding this comment.
thinking about it move the return to the verify top and invert the if statement
if not src.endswith(".png"): etree.parse(path.joinpath(src).resolve())this way we adhere to early return principles and you can get rid of the indentation for the complete png extraction logic
Absolutely - much nicer.
There was a problem hiding this comment.
furthermore the regex ignores the casing this statement is case sensitive so convert to lower casing first and then check the contains
Ahh good spot, thanks for that. That'd be the sort of thing I'd cover with a unit test normally, but there's none in the project so far. Is that a preference given the size of the plugin? Or would you be happy if I added some? Not done it with Python before, but I see Unittest is built-in?
Also fix uppercase scenario and tidy code flow
Add support for editable png diagrams.