-
-
Notifications
You must be signed in to change notification settings - Fork 372
A more extensible text rendering pipeline #5396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| struct GlyphInfo | ||
| glyph::UInt64 | ||
| font::FreeTypeAbstraction.FTFont | ||
| origin::Point3f | ||
| extent::GlyphExtent | ||
| size::Vec2f | ||
| rotation::Quaternionf | ||
| color::RGBAf | ||
| strokecolor::RGBAf | ||
| strokewidth::Float32 | ||
| end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I can tell this is currently the input to the glyphs!() plot. I'm not sure how big of an issue this is, but having these forces a bunch of attributes to be per glyph. That means a bunch of single value optimization paths in (GL) backends become unreachable. It may also cause some computations to be duplicated (e.g. something like calling to_color.(per_glyph_color) when the color doesn't change), though I'm not sure if there are any such cases here.
For reference:
glyph, origin, extentneed to be per glyph because they are glyph informationfonttechnically doesn't need to be per glyph, butfind_font_for_char()may make it hard to avoid.sizeis just the fontsize if I'm not mistaken. Doesn't need to be per glyphrotation, color, strokecolordon't need to be per glyphstrokewidthmust be a single value in GL backends atm, though that could be changed
For GL backends everthing needs to be either per character or a single value in the end. I'm not sure about CairoMakie, but I'd imagine per-string values would be useful if we ever wanted to do text via <text> blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fully with you that this my duplicate some pieces of data that need not to be duplicated, however having for example font size per glyph means that we can do (even without this PR) do things like this:
text(0,0; text="test", fontsize=[10, 5,30,11])
and secondly, all the current implementations seem to be doing the same kind of duplication (calling collect_vector or something similar, but grouping the parameter by kind and not by glyph. That means that there are a lot of map!(attr, [vector with a lot of symbols], :some_output) do ...calls in the code, rather than a map!(attr, :glyphinfos, :some_output) do ... In CairoMakie the plotting was done per glyph anyway, I did not yet look at GLMakie.
(There is a comment about removing GlyphCollection which I somewhat took as an invitation to do so...)
(There are as well a few comments about how the current implementation is not very optimised, so I think there can be something done here, I am however not sure what the right way would be.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fully with you that this my duplicate some pieces of data that need not to be duplicated, however having for example font size per glyph means that we can do (even without this PR) do things like this:
text(0,0; text="test", fontsize=[10, 5,30,11])
I'm not saying that should be removed. Being able to set things as a single value, per string or per glyph should continue to work. I'm just saying it would be good to design things in a way that tries not to duplicate unnecessarily, if possible.
I think just slimming down glyphs to something like
struct GlyphInfo
glyph::UInt64
origin::Point3f
extent::GlyphExtent
endand pushing the rest through attributes would make me happy here. Then we can do single value and per-string optimizations later without breaking changes to what a glyph is. Though I guess we also need to pass something like text_blocks then. Or maybe GlyphInfo can have an index to string/block it belongs to.
and secondly, all the current implementations seem to be doing the same kind of duplication (calling
collect_vectoror something similar
Yea that's true. Looking at
f,a,p = text(rand(Point2f, 10), text = string.(100:100:1000))
GLMakie.display(f)
p.gl_renderobject.parentshows that color, strokecolor and rotation get duplicated. So I guess having the Glyph struct is no worse performance wise in the end, but we could still try to improve things.
(There are as well a few comments about how the current implementation is not very optimised, so I think there can be something done here, I am however not sure what the right way would be.)
Yea most of those are probably me complaining about the code I've written.
(There is a comment about removing
GlyphCollectionwhich I somewhat took as an invitation to do so...)
The GlyphColleciton was needed before to keep attributes synchronized, iirc. That all moved to various computations instead. We basically just kept it around for compatibility, though I don't think we officially deprecated it.
asinghvi17
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some minor things I noticed while skimming it, this weekend I've marked some time out for a more detailed review
|
@henrik-wolf can you make a tiny PR to the readme or something? Once we merge that you'll have merged a PR to the repo, and CI will run without us having to approve |
Co-authored-by: Anshul Singhvi <[email protected]>
Thank you! I think this is mostly(ish) working now with |
Just opened #5407 |
| # TODO: this is probably massively inefficient... | ||
| register_computation!(attr, collect(keys(attr.outputs)), [:plotspecs, :blocks]) do inputs, changed, cached |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here we connect ALL 80 to 90 outputs of the text compute graph to this edge which creates the final vector of plotspecs. This will probably cause a lot of unnecessary recalculations whenever ANYTHING changes in the plot.
I do not have a better Idea for how to do this, especially since swapping out the string type may mean that different inputs are needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This also brings back synchronization issues:
f,a,p = text(rand(Point2f, 10), text = string.(1:10), color = :red);
Makie.update!(p, arg1 = rand(Point2f, 3), text = string.(100:100:300), color = [:red, :green, :blue])There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. I see. Getting the interactivity working is clearly something that I need to think about a bit more in depth. Maybe the ScalarOrVector type in the glyph collections is worth looking at/keeping.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just checked, as I could not remember to have touched the color resolution code in any meaningful way, and this particular update fails as well on 0.24.6, however not immediately (I think the plotlist immediately resolves the plotspecs output), but on drawing the figure. In general, there seem to be a few place in which the ComputeGraph eagerly infers the types of the references, when the recipe allows for other types to be placed there as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, right, that's actually expected to fail. Backends lock in on certain types, so we designed the compute graph to do that too. If it didn't it would just error later in a lot of cases.
I saw that plot!(::PlotList) did something with an observable and got worried it would bring back synchronization errors. I got a bit blinded trying to find an example. Now that I'm looking at it again, I don't think it should be able to cause any new issues. Sorry about that.
I think that also means that your code isn't that bad for performance. You should just get one computation for each update you trigger, no matter how many intermediate nodes are affected. So one from plot.attribute = new_value and also just one from Makie.update!(plot, attrib1 = val1, attrib2 = val2, ...). (The compute graph does two passes. First it marks everything that's outdated. Then it resolves everything, bottom to top, essentially with resolve.(self.parents); self.value = callback(values(self.parents)...).)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that also means that your code isn't that bad for performance. You should just get one computation for each update you trigger, no matter how many intermediate nodes are affected.
While that is true, I loose a lot of the granularity. Every change will recompute everything, while previously certain updates did not need to reconstruct all the glyphs, but just shift them around. Not sure how much of a problem/real performance drawback that will really prove in the end.
| map!(plot.attributes, [:plotspecs, :blocks], :text_boundingboxes) do specs, blocks | ||
| map(blocks) do block | ||
| # TODO: This depends on some other state that I do not know how to wait on... | ||
| mapreduce(p -> boundingbox(p, space), update_boundingbox, plot.plots[1].plots[block], init = Rect3d()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some plots (contour, PolarAxis) want to know the individual bounding boxes on a per string basis (text(Point2f[(0,0), (1,1)]; text=["hello", "world"]) should have two boxes).
However, since every string can now return an arbitrary amount of plots, we need to map the strings to ranges in the PlotSpecs.
This approach here however relies on the plots of the texts plotlist to be full resolved and updated, before we query their boundingboxes.
This does not seem to be the case when dynamically changing the number of strings in a text. This is currently the reason why the test for the PolarAxis fails.
| # FIXME: plotlists overwrite all explicity set attributes on the children | ||
| # HACK: exclude :space from propagating | ||
| if P == PlotList | ||
| filter!(kv -> !in(kv[1], [:space]), attr) | ||
| end | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To get the lines from MathTeX Engine to plot in pixel space, while the glyphs are positioned in data space, plotlist can not inherit/overwrite the space attribute on the child. I think that this is a bug: #5403.
| PlotSpec( | ||
| :LineSegments, | ||
| linesegments_shifted; | ||
| linewidth = line_data.linewidths, | ||
| color = line_data.linecolors, | ||
| space = inputs.markerspace | ||
| ), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How can I reproject the final pixel space coordinates back into data space? That way, we could get around #5403.
|
Another thing I noticed is that every string maps to a new text object. E.g. this f,a,p = text(rand(Point2f, 10), text = string.(1:10), color = :red);ends up creating 10 glyph plots, which means 10 render objects in GL backends. Each render object has a significant amount of overhead because it needs to initiate a new data transfer to the GPU and repeat all the overhead that comes with rendering on the GPU. For example, this is about 950µs vs 550µs mean time for me on master: scene = Scene();
for _ in 1:100
text!(scene, rand(Point2f) .- 0.5, text = "test")
end
colorbuffer(scene);
@benchmark colorbuffer($scene)
scene = Scene();
text!(scene, [rand(Point2f) .- 0.5 for _ in 1:100], text = ["test" for _ in 1:100])
colorbuffer(scene);
@benchmark colorbuffer($scene) |
|
The way I sort of envisioned this was to decide the internal plot type based on the type of the array. If it's an So if we go the route that only one child plot (or let's say a few like Glyph + LineSegments) may exist at a time even for array input, the question is still how to dynamically attach, remove and reattach these child plots given the compute graph infra. I thought the connection would stay intact until the type of the |
Conceptually I mostly like the approach of this design where every string maps (mostly) onto a From the one benchmark that was run in the CI, it is clear that the current state of this PR causes some significant performance regression, even when not going into the backend specifics. I would guess that this is mostly due to the whole
I think it is possible to push the for loop over the strings one layer down to the consumer if the types of strings and layouters are all the same. Or something like that. (Maybe even group the text by string and layouter type and then dispatch on those groups?) Then the
Is this a problem with my implementation, or a general problem with the ability of |
|
Now that I have thought a bit more about it, I am wondering how feasible it would be to push the burden of dealing with "deduplicable" data into the backends while having a sensible, but maybe not trivially high performant structure in the frontend?
could produce something like And then the backend can decide if it just wants to walk the tree and draw each element individually, or whether it wants to do some processing like merging the individual glyphs and drawing them in one single call. |
|
Maybe we could have something like this: where the # You could add to the types DynamicText handles by pushing to this
const TEXT_INPUT_TYPES = [String, LaTeXString, RichText]
function plot!(p::DynamicText)
map(p, :text, :blocks) do texts
offset = 0
return map(eachindex(texts), texts) do idx, stringlike
# probably just 1:num_characters_in_text
block = generate_block(stringlike)::UnitRange{Int64}
block = block .+ offset
offset = last(block)
return block
end
end
for (i, T) in enumerate(TEXT_INPUT_TYPES)
# filter strings of one type and save their indices in the full text array
map!(p, :text, [Symbol(:text, i), Symbol(:view, i)]) do texts
filtered_texts = T[]
indices = Int64[]
for (i, strlike) in enumerate(texts)
if strlike isa T
push!(filtered_texts, strlike)
push!(indices, i)
end
end
return (filtered_texts, indices)
end
# Bookkeeping
attribs = Dict{Symbol, Computed}()
attribs[:text] = getproperty(p, Symbol(:text, i))
# Collect all the attributes for the plot type that handles the string type
for name in attribute_names(get_text_plot_type(T))
# filter the appropriate attributes (which may be single value, per string or per character)
map!(p, [name, :blocks, Symbol(:view, i)], Symbol(name, i)) do attrib, blocks, indices
if attrib isa Vector
if length(attrib) == length(blocks)
return attrib[indices]
elseif length(attrib) == last(last(blocks))
return mapreduce(idx -> attrib[blocks[idx]], vcat, indices)
end
else
return attrib
end
end
attribs[name] = getproperty(p, Symbol(name, i))
end
# filter out appropriate positions (already do PointBased conversions for
# this plot type so you always have Point2f or Point3f vectors here)
map!(p, [:converted_1, Symbol(:view, i)], Symbol(:positions, i)) do positions, indices
return positions[indices]
end
# push all the filtered attributes into the plot function appropriate
# to the string type. Do all the specialized stuff there
plotfunc! = get_text_plot_func(T)
plotfunc!(p, getproperty(p, Symbol(:positions, i)); attribs...)
end
endImplementing a new text plot would then be the same as any other recipe. Compatibility with push!(Makie.TEXT_INPUT_TYPES, MyStringType)
Makie.get_text_plot_func(::Type{MyStringType}) = mytextplot!
Makie.get_text_plot_type(::Type{MyStringType}) = MyTextPlotThe number of glyph plots is still more than necessary, but at least it doesn't scale with the number of strings passed. (We could also set With the dedicated recipes for plain text, latex text, etc you also have the option to skip the overhead from |
|
Thank you! This is great! I will need to look at this carefully though, since we really cant have the performance reduce, since text plots are everywhere and are the only plot types which are actually animated by default (for axis zooming etc). I did want to implement this quite differently, but lets see if we can find common ground - after all I should be happy if the result is maintainable, faster and gives us the flexibility we need. I haven't looked indepth at the discussion and the diff, but thought I better say this before its weeks without me finding time to do an indepth review. |
Co-authored-by: Anshul Singhvi <[email protected]>
Description
Fixes #5378
This is a first attempt at refactoring the text rendering pipeline to enable custom behavior with custom string types.
It replaces the current atomic
Textwith a new atomicGlyphs, that is responsible for rendering glyphs at certain positions. Based on the inputs totext!a user can now intercept the rendering pipeline, and decide to draw onlyGlyphs(for normal strings),GlyphsandLinesSegments(for LatexStrings) or something completely different (potentially something like TikZ Images)this is done by defining a
StringLayoutertype, which is then used to dispatch on inlayouted_string_plotspecs. This function returns a vector ofPlotSpecsfor that specific combination of string and layouter.Type of change
Delete options that do not apply:
(I am not sure what part of the functions that text rendering touches are public and which ones are not. Technically, I would guess that this change is not breaking, but it may very well be that there are packages which rely on some of the internal functions.)
This PR has reached a state in which things are barely starting to work again, so I wanted to put it here to get some feedback, before I dive in and clean everything up.
All the examples in the Text documentation work in CairoMakie, apart from those which use rich text or Latex. All other backends I did not yet look into.
There are a few cases in which blocks use now broken functionality, (polar axis, for example), that I need to still look at. (that is also why I commented out the precompilation workloads)
While working on this, I came across a multitude of function that do not seem to be used anymore, but are still kept in the codebase. Is this just an issue of cleanup? Or do we need to keep them around as part of the api?
Checklist