Greek Wiktionary uses '0', '00' and '000' as parameters #363
|
I am trying to fix a Lua error, and I think I've figured out what's wrong. In Greek Wiktionary's Module:labels we have:
-- about languages, language specifics CHECK [[takeout]] [[φάλαινα]]
print('gl: "', args['γλ'], '" lang: "', args['lang'])
local lang_iso = args['γλ'] or args['lang'] or '' -- or args[2] at [[Template|ετ]]
if lang_iso == '' or lang_iso == nil then
    if label == 'αμερ' or label == 'αμερ γρ' or label == 'αμερ σημασία'
    or label == 'βρετ' or label == 'βρετ γρ' or label == 'βρετ σημασία'
    then lang_iso = 'en'
    else lang_iso = 'el'
    end
end
print('*' .. string.byte(lang_iso) .. '*', languages)
local lang_name = languages[lang_iso].name or ''
and it turns out that |
aaaaargh I don't get why these template arguments don't get stripped properly. This is probably unrelated to the |
|
We should be able to pass this test, but currently we get a result of "nilnilnilnil":
def test_el_zero_arg(self):
    self.wtp.start_page("Πρότυπο:ετ")
    self.wtp.add_page(
        "Module:test",
        828,
        """local export = {}
function export.test(frame)
  --print(mw.dumpObject(frame.args))
  return tostring(frame.args[0]) .. tostring(frame.args["0"]) .. tostring(frame.args[00]) .. tostring(frame.args["00"]) .. tostring(frame.args[42]) .. tostring(frame.args["42"]) .. tostring(frame.args[042]) .. tostring(frame.args["042"])
end
return export""",
    )
    self.assertEqual(self.wtp.expand(
        "{{#invoke:test|test|0=0|00=1|42=2|042=3}}"), "00012223"
    ) |
|
I'd like to suggest these changes:
At _sandbox_phase2.lua line 13, try the string key first, then try the number key:
local v = new_args._orig[key]
if v == nil then
    local i = tonumber(key)
    if i ~= nil then
        key = i
    else
        return nil
    end
    v = new_args._orig[key]
    if v == nil then
        return nil
    end
end
At luaexec.py line 443, only convert "0" and numbers that don't start with "0":
if k.isdigit() and (not k.startswith("0") or k == "0"):
    k = int(k)
    if k < 0 or k > 1000: |
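The two suggestions above can be sketched together in plain Python. This is only a rough model, not the actual wikitextprocessor code: `normalize_arg_key`, `build_args` and `lookup` are hypothetical names, the range check from the luaexec.py snippet is left out because it is cut off above, and the `isascii()` guard is my own addition so that `int()` never sees non-ASCII digits. The lookup is also simpler than Lua's `tonumber`, which accepts more input forms.

```python
from typing import Dict, Optional, Union

Key = Union[int, str]


def normalize_arg_key(k: str) -> Key:
    """Convert "0" and digit strings without a leading zero to int (hypothetical helper)."""
    if k.isascii() and k.isdigit() and (not k.startswith("0") or k == "0"):
        return int(k)
    # Names such as "00" or "042" stay strings.
    return k


def build_args(raw: Dict[str, str]) -> Dict[Key, str]:
    """Build the argument table the way the Python side would hand it to Lua."""
    return {normalize_arg_key(k): v for k, v in raw.items()}


def lookup(args: Dict[Key, str], key: Key) -> Optional[str]:
    """Try the key as given first, then fall back to its numeric form."""
    if key in args:
        return args[key]
    if isinstance(key, str) and key.isascii() and key.isdigit():
        return args.get(int(key))
    return None


# Same arguments as {{#invoke:test|test|0=0|00=1|42=2|042=3}}.
args = build_args({"0": "0", "00": "1", "42": "2", "042": "3"})

# In Lua, the numeric literals 00 and 042 are simply 0 and 42.
keys = [0, "0", 0, "00", 42, "42", 42, "042"]
result = "".join(str(lookup(args, k)) for k in keys)
print(result)  # prints "00012223"
```

Under these assumptions the model produces the string the test expects, because "00" and "042" are found as string keys before any numeric fallback is attempted.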
|
For removing whitespace in expanded template args:
def test_el_strip_arg(self):
    self.wtp.start_page("θηλυκός")
    self.wtp.add_page(
        "Module:test",
        828,
        """local export = {}
function export.test(frame)
  return tostring(frame.args[0])
end
return export""",
    )
    self.assertEqual(self.wtp.expand(
        "{{#invoke:test|test|0={{{1|}}} {{{2|}}}}}"), ""
    )
at the end of
ret = expand_all_templates(v)
if ret.strip() == "":
    ret = ""
maybe it would be better to check for an empty string here: |
|
I think that only positive, non-zero integers are allowed in Wikitext parameters, so using
The current problem is, I think, that when we do stuff in wikitextprocessor/core expand_recurse:
if kind == "T":
    # Template transclusion or parser function call.
    # Expand its arguments.
    print(f"{kind=}, {args=}, {argmap=}")
    new_args = tuple(
        expand_args(x, argmap).removesuffix("\n")
        for x in args
    )
    print(f"{new_args=}")
    parts.append(self._save_value(kind, new_args, nowiki))
    continue
That seems to always leave a magic character for
What used to be normal magic characters with different values have been replaced by
Why hasn't this come up before? I think it's because Lua simply discards the magic character because it's out of Unicode's normal range of characters. But in this case, all the |
|
I think the first "0" arg problem is not fixed, because
Could you add a simplified test for the second magic characters problem?
Update: This is the third problem; the second is whitespace in template arguments not being stripped (maybe they are the same problem, or related?)...
Update: I think I kind of understand the magic characters problem in "cookies". This does look strange, but I'm not familiar with the
Update: I searched the code and the |
|
Aaah, there's a check to see if the cookie data already exists in and it's exactly those |
|
So... I guess it's normal? They still expand correctly? |
|
The magic characters end up getting into Lua. They're not expanded. Lua just seems to discard the weird Unicode characters. At least it seems that way... |
|
But aren't the arguments expanded here before the args are passed to the Lua module function?
You mean that "preprocess()" can't expand these encoded |
|
You are absolutely correct, how did I miss that... We might have to strip() the string at that point, somehow. Lua doesn't seem to have a native implementation of just |
|
if not new_args._preprocessed[key] then
    local frame = new_args._frame
    v = frame:preprocess(v)
    if key ~= i then
        v = v:match "^%s*(.-)%s*$"
    end
    -- Cache preprocessed value so we only preprocess each argument once
    new_args._preprocessed[key] = true
    new_args._orig[key] = v
This seems to do it. The check for |
|
"1= bb " does get stripped... and don't forget |
Check out wikitextprocessor PR #363. The issue was that in some Greek templates, we had a ton of `{{#if:...}}` templates as template arguments, and they were written with spaces in between: `| {{#if:...}} {{#if:...}} |`. The ifs left magic characters which prevented stripping before they were expanded into empty strings, so the trimming has to be done later.
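A small Python sketch of why the early strip is not enough. Everything here is a stand-in: `PLACEHOLDER_BASE`, `save_value` and `expand_placeholders` are hypothetical and only imitate the idea of replacing an unexpanded `{{#if:...}}` with a single "magic" character. The real encoding in wikitextprocessor may differ.

```python
from typing import Callable, List

# Hypothetical private-use code point used as the first placeholder (assumption).
PLACEHOLDER_BASE = 0x100000


def save_value(cookies: List[str], source: str) -> str:
    """Store unexpanded wikitext and return a one-character placeholder for it."""
    cookies.append(source)
    return chr(PLACEHOLDER_BASE + len(cookies) - 1)


def expand_placeholders(
    text: str, cookies: List[str], evaluate: Callable[[str], str]
) -> str:
    """Replace every placeholder character with the expansion of its stored source."""
    out = []
    for ch in text:
        idx = ord(ch) - PLACEHOLDER_BASE
        if 0 <= idx < len(cookies):
            out.append(evaluate(cookies[idx]))
        else:
            out.append(ch)
    return "".join(out)


cookies: List[str] = []
# Models the argument `| {{#if:...}} {{#if:...}} |` after the first pass.
arg = " " + save_value(cookies, "{{#if:|x}}") + " " + save_value(cookies, "{{#if:|y}}") + " "

# Stripping now removes only the outer spaces; the placeholders are not whitespace.
early = arg.strip()

# Both ifs have an empty condition and no else branch, so they expand to "".
expanded = expand_placeholders(early, cookies, lambda source: "")

# Only a strip after expansion yields the empty string.
late = expanded.strip()
```

In this model `early` still has three characters (placeholder, space, placeholder), `expanded` is a single space, and only `late` is empty, which is the behaviour the commit message describes.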
|
@xxyzz I had to make changes to one of your tests (Russian, remove \n at the end of unnamed arguments) and I would like you to check I didn't do anything dumb again. |
Trim whitespace around frame args after expanding them
Co-authored-by: xxyzz <gitpull@protonmail.com>
very helpfully for debugging Lua modules
|
Thanks for taking a look and completing the stuff I missed! 👍 |
https://el.wiktionary.org/w/index.php?title=%CE%A0%CF%81%CF%8C%CF%84%CF%85%CF%80%CE%BF:%CE%B5%CF%84
Apparently parameter names under 1, like '0' or '00', aren't changed to integers... Or Scribunto handles all parameter names as strings and just does something special with the indexing.
Still need to figure out some things...