Skip to content

Conversation

@henrik-wolf
Copy link
Contributor

Description

Fixes #5378

This is a first attempt at refactoring the text rendering pipeline to enable custom behavior with custom string types.

It replaces the current atomic Text with a new atomic Glyphs, that is responsible for rendering glyphs at certain positions. Based on the inputs to text! a user can now intercept the rendering pipeline, and decide to draw only Glyphs (for normal strings), Glyphs and LinesSegments (for LatexStrings) or something completely different (potentially something like TikZ Images)

this is done by defining a StringLayouter type, which is then used to dispatch on in layouted_string_plotspecs. This function returns a vector of PlotSpecs for that specific combination of string and layouter.

Type of change

Delete options that do not apply:

  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

(I am not sure what part of the functions that text rendering touches are public and which ones are not. Technically, I would guess that this change is not breaking, but it may very well be that there are packages which rely on some of the internal functions.)

This PR has reached a state in which things are barely starting to work again, so I wanted to put it here to get some feedback, before I dive in and clean everything up.

All the examples in the Text documentation work in CairoMakie, apart from those which use rich text or Latex. All other backends I did not yet look into.

There are a few cases in which blocks use now broken functionality, (polar axis, for example), that I need to still look at. (that is also why I commented out the precompilation workloads)

While working on this, I came across a multitude of function that do not seem to be used anymore, but are still kept in the codebase. Is this just an issue of cleanup? Or do we need to keep them around as part of the api?

Checklist

  • Added an entry in CHANGELOG.md (for new features and breaking changes)
  • Added or changed relevant sections in the documentation
  • Added unit tests for new algorithms, conversion methods, etc.
  • Added reference image tests for new plotting functions, recipes, visual options, etc.

@github-project-automation github-project-automation bot moved this to Work in progress in PR review Nov 9, 2025
Comment on lines +1 to +11
struct GlyphInfo
glyph::UInt64
font::FreeTypeAbstraction.FTFont
origin::Point3f
extent::GlyphExtent
size::Vec2f
rotation::Quaternionf
color::RGBAf
strokecolor::RGBAf
strokewidth::Float32
end
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell this is currently the input to the glyphs!() plot. I'm not sure how big of an issue this is, but having these forces a bunch of attributes to be per glyph. That means a bunch of single value optimization paths in (GL) backends become unreachable. It may also cause some computations to be duplicated (e.g. something like calling to_color.(per_glyph_color) when the color doesn't change), though I'm not sure if there are any such cases here.

For reference:

  • glyph, origin, extent need to be per glyph because they are glyph information
  • font technically doesn't need to be per glyph, but find_font_for_char() may make it hard to avoid.
  • size is just the fontsize if I'm not mistaken. Doesn't need to be per glyph
  • rotation, color, strokecolor don't need to be per glyph
  • strokewidth must be a single value in GL backends atm, though that could be changed

For GL backends everthing needs to be either per character or a single value in the end. I'm not sure about CairoMakie, but I'd imagine per-string values would be useful if we ever wanted to do text via <text> blocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fully with you that this my duplicate some pieces of data that need not to be duplicated, however having for example font size per glyph means that we can do (even without this PR) do things like this:
text(0,0; text="test", fontsize=[10, 5,30,11])

and secondly, all the current implementations seem to be doing the same kind of duplication (calling collect_vector or something similar, but grouping the parameter by kind and not by glyph. That means that there are a lot of map!(attr, [vector with a lot of symbols], :some_output) do ...calls in the code, rather than a map!(attr, :glyphinfos, :some_output) do ... In CairoMakie the plotting was done per glyph anyway, I did not yet look at GLMakie.

(There is a comment about removing GlyphCollection which I somewhat took as an invitation to do so...)

(There are as well a few comments about how the current implementation is not very optimised, so I think there can be something done here, I am however not sure what the right way would be.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am fully with you that this my duplicate some pieces of data that need not to be duplicated, however having for example font size per glyph means that we can do (even without this PR) do things like this:
text(0,0; text="test", fontsize=[10, 5,30,11])

I'm not saying that should be removed. Being able to set things as a single value, per string or per glyph should continue to work. I'm just saying it would be good to design things in a way that tries not to duplicate unnecessarily, if possible.

I think just slimming down glyphs to something like

struct GlyphInfo
    glyph::UInt64
    origin::Point3f
    extent::GlyphExtent
end

and pushing the rest through attributes would make me happy here. Then we can do single value and per-string optimizations later without breaking changes to what a glyph is. Though I guess we also need to pass something like text_blocks then. Or maybe GlyphInfo can have an index to string/block it belongs to.

and secondly, all the current implementations seem to be doing the same kind of duplication (calling collect_vector or something similar

Yea that's true. Looking at

f,a,p = text(rand(Point2f, 10), text = string.(100:100:1000))
GLMakie.display(f)
p.gl_renderobject.parent

shows that color, strokecolor and rotation get duplicated. So I guess having the Glyph struct is no worse performance wise in the end, but we could still try to improve things.

(There are as well a few comments about how the current implementation is not very optimised, so I think there can be something done here, I am however not sure what the right way would be.)

Yea most of those are probably me complaining about the code I've written.

(There is a comment about removing GlyphCollection which I somewhat took as an invitation to do so...)

The GlyphColleciton was needed before to keep attributes synchronized, iirc. That all moved to various computations instead. We basically just kept it around for compatibility, though I don't think we officially deprecated it.

Copy link
Member

@asinghvi17 asinghvi17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some minor things I noticed while skimming it, this weekend I've marked some time out for a more detailed review

@asinghvi17
Copy link
Member

asinghvi17 commented Nov 13, 2025

@henrik-wolf can you make a tiny PR to the readme or something? Once we merge that you'll have merged a PR to the repo, and CI will run without us having to approve

@henrik-wolf
Copy link
Contributor Author

Just some minor things I noticed while skimming it, this weekend I've marked some time out for a more detailed review

Thank you! I think this is mostly(ish) working now with CairoMakie. There are some things that I am slightly concerned about, I will put some guiding comments into the review.

@henrik-wolf
Copy link
Contributor Author

@henrik-wolf can you make a tiny PR to the readme or something? Once we merge that you'll have merged a PR to the repo, and CI will run without us having to approve

Just opened #5407

Comment on lines +44 to +45
# TODO: this is probably massively inefficient...
register_computation!(attr, collect(keys(attr.outputs)), [:plotspecs, :blocks]) do inputs, changed, cached
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we connect ALL 80 to 90 outputs of the text compute graph to this edge which creates the final vector of plotspecs. This will probably cause a lot of unnecessary recalculations whenever ANYTHING changes in the plot.

I do not have a better Idea for how to do this, especially since swapping out the string type may mean that different inputs are needed.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also brings back synchronization issues:

f,a,p = text(rand(Point2f, 10), text = string.(1:10), color = :red);
Makie.update!(p, arg1 = rand(Point2f, 3), text = string.(100:100:300), color = [:red, :green, :blue])

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I see. Getting the interactivity working is clearly something that I need to think about a bit more in depth. Maybe the ScalarOrVector type in the glyph collections is worth looking at/keeping.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checked, as I could not remember to have touched the color resolution code in any meaningful way, and this particular update fails as well on 0.24.6, however not immediately (I think the plotlist immediately resolves the plotspecs output), but on drawing the figure. In general, there seem to be a few place in which the ComputeGraph eagerly infers the types of the references, when the recipe allows for other types to be placed there as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right, that's actually expected to fail. Backends lock in on certain types, so we designed the compute graph to do that too. If it didn't it would just error later in a lot of cases.

I saw that plot!(::PlotList) did something with an observable and got worried it would bring back synchronization errors. I got a bit blinded trying to find an example. Now that I'm looking at it again, I don't think it should be able to cause any new issues. Sorry about that.

I think that also means that your code isn't that bad for performance. You should just get one computation for each update you trigger, no matter how many intermediate nodes are affected. So one from plot.attribute = new_value and also just one from Makie.update!(plot, attrib1 = val1, attrib2 = val2, ...). (The compute graph does two passes. First it marks everything that's outdated. Then it resolves everything, bottom to top, essentially with resolve.(self.parents); self.value = callback(values(self.parents)...).)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that also means that your code isn't that bad for performance. You should just get one computation for each update you trigger, no matter how many intermediate nodes are affected.

While that is true, I loose a lot of the granularity. Every change will recompute everything, while previously certain updates did not need to reconstruct all the glyphs, but just shift them around. Not sure how much of a problem/real performance drawback that will really prove in the end.

Comment on lines +133 to +136
map!(plot.attributes, [:plotspecs, :blocks], :text_boundingboxes) do specs, blocks
map(blocks) do block
# TODO: This depends on some other state that I do not know how to wait on...
mapreduce(p -> boundingbox(p, space), update_boundingbox, plot.plots[1].plots[block], init = Rect3d())
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some plots (contour, PolarAxis) want to know the individual bounding boxes on a per string basis (text(Point2f[(0,0), (1,1)]; text=["hello", "world"]) should have two boxes).
However, since every string can now return an arbitrary amount of plots, we need to map the strings to ranges in the PlotSpecs.

This approach here however relies on the plots of the texts plotlist to be full resolved and updated, before we query their boundingboxes.
This does not seem to be the case when dynamically changing the number of strings in a text. This is currently the reason why the test for the PolarAxis fails.

Comment on lines +733 to +738
# FIXME: plotlists overwrite all explicity set attributes on the children
# HACK: exclude :space from propagating
if P == PlotList
filter!(kv -> !in(kv[1], [:space]), attr)
end

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get the lines from MathTeX Engine to plot in pixel space, while the glyphs are positioned in data space, plotlist can not inherit/overwrite the space attribute on the child. I think that this is a bug: #5403.

Comment on lines +41 to +47
PlotSpec(
:LineSegments,
linesegments_shifted;
linewidth = line_data.linewidths,
color = line_data.linecolors,
space = inputs.markerspace
),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How can I reproject the final pixel space coordinates back into data space? That way, we could get around #5403.

@ffreyer
Copy link
Collaborator

ffreyer commented Nov 14, 2025

Another thing I noticed is that every string maps to a new text object. E.g. this

f,a,p = text(rand(Point2f, 10), text = string.(1:10), color = :red);

ends up creating 10 glyph plots, which means 10 render objects in GL backends. Each render object has a significant amount of overhead because it needs to initiate a new data transfer to the GPU and repeat all the overhead that comes with rendering on the GPU. For example, this is about 950µs vs 550µs mean time for me on master:

scene = Scene();
for _ in 1:100
    text!(scene, rand(Point2f) .- 0.5, text = "test")
end
colorbuffer(scene);
@benchmark colorbuffer($scene)

scene = Scene();
text!(scene, [rand(Point2f) .- 0.5 for _ in 1:100], text = ["test" for _ in 1:100])
colorbuffer(scene);
@benchmark colorbuffer($scene)

@jkrumbiegel
Copy link
Member

The way I sort of envisioned this was to decide the internal plot type based on the type of the array. If it's an AbstractVector{String} we can use a single efficient Glyph plot for everything. Either mixing types in vectors should not be allowed so we can always use one child plot, or we'd only use the "one plot for every element" method as a fallback. But I'm not sure it has to be supported, sure it could have its uses for changing one value in a ticklabel array lets say, but it does create quite a bit of complexity.

So if we go the route that only one child plot (or let's say a few like Glyph + LineSegments) may exist at a time even for array input, the question is still how to dynamically attach, remove and reattach these child plots given the compute graph infra. I thought the connection would stay intact until the type of the text argument changes, in which case there'd be a reconfiguration to new child plot type.

@henrik-wolf
Copy link
Contributor Author

ends up creating 10 glyph plots, which means 10 render objects in GL backends

Conceptually I mostly like the approach of this design where every string maps (mostly) onto a Glyphs plot. (Of course, for latex strings there are two different plots that are returned. From a design perspective I would ideally want to have another SingleText plot, that has as many subplots as your string layouting wants to produce, like a Glyphs and a LineSegments for latex strings. That way we would not have to deal with the whole :blocks thing and the different levels of combined or individual bounding boxes: you want the individual boxes of each string? just do map(boundingbox, text.plots))

From the one benchmark that was run in the CI, it is clear that the current state of this PR causes some significant performance regression, even when not going into the backend specifics. I would guess that this is mostly due to the whole plotlist machinery.

The way I sort of envisioned this was to decide the internal plot type based on the type of the array. If it's an AbstractVector{String} we can use a single efficient Glyph plot for everything. Either mixing types in vectors should not be allowed so we can always use one child plot, or we'd only use the "one plot for every element" method as a fallback. But I'm not sure it has to be supported, sure it could have its uses for changing one value in a ticklabel array lets say, but it does create quite a bit of complexity.

I think it is possible to push the for loop over the strings one layer down to the consumer if the types of strings and layouters are all the same. Or something like that. (Maybe even group the text by string and layouter type and then dispatch on those groups?)

Then the glyphs recipe could take a vector of GlyphCollections as well as a vector of positions and offsets for these collections. Internally, we then join them together, much like we did with the texts previously.
That would mean that every Glyphs needs to expose multiple bounding boxes, which we later need to match up with the bounding boxes of the other plots (at the moment mostly LineSegments) that may or may not be produced.

So if we go the route that only one child plot (or let's say a few like Glyph + LineSegments) may exist at a time even for array input, the question is still how to dynamically attach, remove and reattach these child plots given the compute graph infra. I thought the connection would stay intact until the type of the text argument changes, in which case there'd be a reconfiguration to new child plot type.

Is this a problem with my implementation, or a general problem with the ability of plotspec to infer these kinds of changes?

@henrik-wolf
Copy link
Contributor Author

Now that I have thought a bit more about it, I am wondering how feasible it would be to push the burden of dealing with "deduplicable" data into the backends while having a sensible, but maybe not trivially high performant structure in the frontend?

text(rand(Point2f, 3); text = ["text", L"\frac{1}{2}", rich("test")])

could produce something like

Text
├─ PlotList
│  ├─ SingleText
│  │  ├─ Glyphs
│  ├─ SingleText
│  │  ├─ Glyphs
│  │  ├─ LineSegments
│  ├─ SingleText
│  │  ├─ Glyphs

And then the backend can decide if it just wants to walk the tree and draw each element individually, or whether it wants to do some processing like merging the individual glyphs and drawing them in one single call.

@ffreyer
Copy link
Collaborator

ffreyer commented Nov 14, 2025

Maybe we could have something like this:

DynamicText
├─ PlainTextPlot
│  ├─ Glyphs
├─ LaTeXTextPlot 
│  ├─ Glyphs
│  ├─ LineSegments
├─ RichTextPlot
│  ├─ Glyphs

where the DynamicText just handles the distribution of inputs to child plots, which are created once when the plot is initialized.

# You could add to the types DynamicText handles by pushing to this
const TEXT_INPUT_TYPES = [String, LaTeXString, RichText]

function plot!(p::DynamicText)
    map(p, :text, :blocks) do texts
        offset = 0
        return map(eachindex(texts), texts) do idx, stringlike
            # probably just 1:num_characters_in_text
            block = generate_block(stringlike)::UnitRange{Int64}
            block = block .+ offset
            offset = last(block)
            return block
        end
    end

    for (i, T) in enumerate(TEXT_INPUT_TYPES)
        # filter strings of one type and save their indices in the full text array
        map!(p, :text, [Symbol(:text, i), Symbol(:view, i)]) do texts
            filtered_texts = T[]
            indices = Int64[]
            for (i, strlike) in enumerate(texts)
                if strlike isa T
                    push!(filtered_texts, strlike)
                    push!(indices, i)
                end
            end
            return (filtered_texts, indices)
        end

        # Bookkeeping
        attribs = Dict{Symbol, Computed}()
        attribs[:text] = getproperty(p, Symbol(:text, i))

        # Collect all the attributes for the plot type that handles the string type
        for name in attribute_names(get_text_plot_type(T))
            # filter the appropriate attributes (which may be single value, per string or per character)
            map!(p, [name, :blocks, Symbol(:view, i)], Symbol(name, i)) do attrib, blocks, indices
                if attrib isa Vector
                    if length(attrib) == length(blocks)
                        return attrib[indices]
                    elseif length(attrib) == last(last(blocks))
                        return mapreduce(idx -> attrib[blocks[idx]], vcat, indices)
                    end
                else
                    return attrib
                end
            end
            attribs[name] = getproperty(p, Symbol(name, i))
        end

        # filter out appropriate positions (already do PointBased conversions for
        # this plot type so you always have Point2f or Point3f vectors here)
        map!(p, [:converted_1, Symbol(:view, i)], Symbol(:positions, i)) do positions, indices
            return positions[indices]
        end

        # push all the filtered attributes into the plot function appropriate 
        # to the string type. Do all the specialized stuff there
        plotfunc! = get_text_plot_func(T)
        plotfunc!(p, getproperty(p, Symbol(:positions, i)); attribs...)
    end
end

Implementing a new text plot would then be the same as any other recipe. Compatibility with DynamicText would be given by

push!(Makie.TEXT_INPUT_TYPES, MyStringType)
Makie.get_text_plot_func(::Type{MyStringType}) = mytextplot!
Makie.get_text_plot_type(::Type{MyStringType}) = MyTextPlot

The number of glyph plots is still more than necessary, but at least it doesn't scale with the number of strings passed. (We could also set visible = false when a plot is empty to avoid rendering cost, since those are skipped. There would still be the upfront cost of creating a render object though.)

With the dedicated recipes for plain text, latex text, etc you also have the option to skip the overhead from DynamicText by just using them directly.

@SimonDanisch
Copy link
Member

Thank you! This is great! I will need to look at this carefully though, since we really cant have the performance reduce, since text plots are everywhere and are the only plot types which are actually animated by default (for axis zooming etc).
If we refactor text it has to become faster and not slower and I do have some ideas about that, which may make merging this harder :(

I did want to implement this quite differently, but lets see if we can find common ground - after all I should be happy if the result is maintainable, faster and gives us the flexibility we need.

I haven't looked indepth at the discussion and the diff, but thought I better say this before its weeks without me finding time to do an indepth review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Work in progress

Development

Successfully merging this pull request may close these issues.

Proposal for a more extensible text rendering pipeline

5 participants