User talk:Robert Ullmann/Mismatched wikisyntax


Additional multi-line templates

{{pl-decl-noun-sing}} is another multi-line template that is generating false positives on this list. Thryduulf 14:30, 5 May 2008 (UTC)

Yes, I know, and we're going to find another few dozen turning up. Just leave them, or blank the section if you've done most everything else, and the next run will drop them. (This one is already in the stops list for the next run.) Robert Ullmann 14:33, 5 May 2008 (UTC)
You can add them to User:Robert Ullmann/Mismatched wikisyntax/multiline and they will be handled on the next run. Robert Ullmann 23:21, 9 May 2008 (UTC)

Unbalanced headers

Can it check for unbalanced headers? I've found a Usage notes header with five equal signs on the left and four on the right. --Panda10 22:41, 9 May 2008 (UTC)

These are usually fixed automatically by AF, so they are not checked for here. Connel also checks all of them after each dump, but we haven't had a dump in a while. Good idea, though; it was one of the things checked on 'pedia when they did something similar once. Robert Ullmann 23:19, 9 May 2008 (UTC)
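A minimal sketch of what such a header check could look like, for illustration only. The pattern and the function name `unbalanced_header` are assumptions, not taken from the report's actual code; the idea is simply to compare the lengths of the two runs of equals signs.

```python
import re

# Assumed header shape: a run of '=', a title, and a closing run of '='.
HEADER = re.compile(r"^(=+)\s*(.*?)\s*(=+)\s*$")

def unbalanced_header(line):
    """Return True if the line looks like a header whose '=' runs differ in length."""
    m = HEADER.match(line)
    if not m:
        return False  # not a header-shaped line at all
    return len(m.group(1)) != len(m.group(3))

print(unbalanced_header("=====Usage notes===="))  # five on the left, four on the right
```

Lines that open with equals signs but never close them are not treated as headers here, which is a deliberate simplification.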

slashes in IPA and SAMPA templates

Could you check for IPA and SAMPA pronunciations that have an odd number of forward slashes (e.g. {{IPA|/...}}, {{SAMPA|/.../|.../}}, etc.)? Thryduulf 19:29, 10 May 2008 (UTC)

Hmm, a couple of rules that match only well-formed template invocations ... if '{{IPA' in line and not ipa.match(line) ... seems like a reasonable idea. But malformed pronunciation lines in general constitute a fairly large set ... Robert Ullmann 14:48, 17 June 2008 (UTC)
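One way the odd-slash request could be sketched, under stated assumptions: this version counts slashes per template argument rather than using a "well-formed" pattern like the `ipa` regex mentioned above, and the names `PRON` and `odd_slash_templates` are hypothetical.

```python
import re

# Non-greedy so that two templates on one line are examined separately.
PRON = re.compile(r"\{\{(?:IPA|SAMPA)\|(.*?)\}\}")

def odd_slash_templates(line):
    """Return the IPA/SAMPA invocations on this line that have an
    odd number of '/' in at least one argument."""
    flagged = []
    for m in PRON.finditer(line):
        args = m.group(1).split("|")
        if any(arg.count("/") % 2 == 1 for arg in args):
            flagged.append(m.group(0))
    return flagged

print(odd_slash_templates("* {{IPA|/kat}}, {{SAMPA|/kat/|kat/}}"))
```

Pronunciations whose SAMPA text itself contains `}}` would confuse this pattern, so it is only a rough filter.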

Unbalanced apostrophes

Could it check for unbalanced double and triple apostrophes that we use for italics and bold characters? --Panda10 20:46, 17 May 2008 (UTC)

Yes, but this is considerably tricky. The WM parser uses a heuristic that gets the "right answer" most of the time, but is hardly rigorous. (Someone, probably more than one person, has observed that it is impossible to write a BNF specification for the WM syntax ...) I will keep trying to work out whether there is something useful; the simplest method might work, or might be awful, and only a trial would tell.

Consider a quotation: "I don' wanna'!"

Now consider the wikisyntax, if "wanna'" is the word to be bolded:

I don' '''wanna''''!

and WM renders: I don' wanna'! correctly. You wanna' parse that shite? (grins ;-) Robert Ullmann 23:03, 17 May 2008 (UTC)
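For what it's worth, the "simplest method" could be tried along these lines. This is only a sketch and does not reproduce the WM parser's heuristic; it assumes the usual reading of apostrophe runs (two for italics, three for bold, four as one literal apostrophe plus bold, five or more as bold italics plus literal extras) and simply checks that each kind of toggle occurs an even number of times on a line. The function names are hypothetical.

```python
import re

def quote_toggles(line):
    """Count italic and bold toggles implied by runs of two or more apostrophes."""
    italic = bold = 0
    for run in re.findall(r"'{2,}", line):
        n = len(run)
        if n == 2:
            italic += 1
        elif n in (3, 4):      # four = one literal apostrophe + bold
            bold += 1
        else:                  # five or more = bold italics (+ literal extras)
            italic += 1
            bold += 1
    return italic, bold

def apostrophes_look_unbalanced(line):
    """True if either kind of toggle occurs an odd number of times on the line."""
    italic, bold = quote_toggles(line)
    return italic % 2 == 1 or bold % 2 == 1

print(apostrophes_look_unbalanced("I don' '''wanna''''!"))
```

On the quotation above this counts two bold toggles and no italic ones, so the line is not flagged; whether it copes with real entries is exactly what a trial run would have to show.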

Multiline templates

Is it possible for this to check multiline templates properly? A reasonable number of the current ones are listed here for that reason. Conrad.Irwin 23:59, 24 May 2008 (UTC)

(Not sure what you mean by "properly"?) It checks templates that are expected to be, or often are, multiline differently from most others, which either should not be multiline or are so only in very exceptional cases. {{sa-decl-noun}} is expected to be multiline, but if {{IPA}} is, then there is certainly a problem. Templates that show up because they are not being checked as multiline can be added to the subpage (which is read by the program), but that list shouldn't include others. For example, {{en-noun}} should not be on that list because that would skip checks that we want 99.9985% of the time. (I'm not making that number up; there are 67,064 entries in the noun cat, and only one that I've had to add to the stops list.) Robert Ullmann 17:24, 25 May 2008 (UTC)

X-SAMPA

I think the contents of the X-SAMPA template should also be ignored: {{X-SAMPA|["h{nt{]}} Maro 22:54, 14 April 2010 (UTC)

  • You should probably also add [..{..] and [..}..] to find any non-templated pronunciations that use square brackets rather than slashes (/..}../ is already searched for). These should be picked up and fixed by the pronunciations exceptions report, so there isn't a need for this report to flag them. Thryduulf 11:35, 18 April 2010 (UTC)
Done, and done. Will re-run presently. Robert Ullmann 14:33, 20 April 2010 (UTC)

Is the multiline templates list being fully read?

In the latest run, entries like Moldavija have appeared on the list, despite the only issue being with multiline template sh-decl-noun, which is noted in the intro blurb as being "ignored at present". At least one of the entries in the Cyrillic section was using {{ru-verb-1-pf}} which is also listed. Thryduulf 16:24, 20 April 2010 (UTC)

triple braces around SAMPA

This report finds where the wrong number of braces is used around most templates, e.g. {{{en-noun}}, but as it ignores SAMPA templates, will it find {{{SAMPA|/"In.stINkt/}} (a typo I've just made, but spotted before saving)? Thryduulf (talk) 11:14, 8 May 2010 (UTC)

multiline template problem

The report is now flagging the closing }} of multi-line templates as an issue, even when the template is listed on the multiline templates subpage, e.g.

  • distrarre (edit)
    imp3s=distragga|imp3p=distraggano}}
  • do (edit)
    |passage=Next morning, they woke about ten o'clock, Kev, went for a shower while Alice, '''did''' some toast, put the kettle on, and when he came out, she went in.}} <!-- author thinks he is making clever artistic statement with these commas -->
  • dodawać (edit)
    pp=dodawan|pp2=dodawani|ip=dodawano|vn=dodawanie}}

I guess it is ignoring the first part of the multi-line template as it should, but finishing at the end of that line rather than the end of the template. Thryduulf (talk) 00:59, 10 May 2010 (UTC)
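A rough sketch of the behaviour being asked for, assuming the program works line by line: once a listed template opens, lines are skipped until the brace depth returns to zero rather than only to the end of the opening line. The function `lines_to_check` and the parameter names in the usage example are hypothetical, and the real program may track this quite differently.

```python
def lines_to_check(lines, multiline_names):
    """Yield (number, line) for lines that lie outside any listed multi-line template."""
    depth = 0  # count of unclosed '{{' inside a skipped template; 0 = not skipping
    for number, line in enumerate(lines, start=1):
        if depth == 0:
            # Simplification: a prefix test, so a listed name also matches
            # longer template names that start with it.
            if any("{{" + name in line for name in multiline_names):
                depth = max(line.count("{{") - line.count("}}"), 0)
                continue  # the opening line itself is never checked
            yield number, line
        else:
            depth = max(depth + line.count("{{") - line.count("}}"), 0)

entry = [
    "{{pl-decl-noun-sing",
    "|nom=x",        # hypothetical parameters, for illustration only
    "|gen=y}}",
    "# definition",
]
print(list(lines_to_check(entry, ["pl-decl-noun-sing"])))
```

With this approach the line holding the closing }} is consumed as part of the template, so it never reaches the bracket checks.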

end of tables

I've seen a few tables where the closing syntax |} is one space in from the start of the line, and these should be excluded from this report (i.e. neither of the closing syntaxes below should trigger an entry in the report):

|}
 |}

See e.g. koillinen#Finnish for an example of the second. Thryduulf (talk) 10:40, 11 June 2010 (UTC)

indented tables

As at tro#Derived terms, indented tables appear to be generating false positives. The code should ignore all instances of {| preceded by colons, no matter how many colons there are. Thryduulf (talk) 15:04, 22 June 2010 (UTC)
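The two table requests (this one and the one in the preceding section) could be covered by a pair of patterns along these lines. This is an illustrative sketch; the names `TABLE_OPEN`, `TABLE_CLOSE` and `table_balance` are assumptions, as is the choice to allow any amount of leading whitespace before |} rather than exactly one space.

```python
import re

# '{|' at the start of a line, optionally preceded by any number of colons.
TABLE_OPEN = re.compile(r"^:*\{\|")
# '|}' at the start of a line, optionally indented; the lookahead avoids
# treating '|}}' (the end of a template) as a table close.
TABLE_CLOSE = re.compile(r"^\s*\|\}(?!\})")

def is_table_open(line):
    return bool(TABLE_OPEN.match(line))

def is_table_close(line):
    return bool(TABLE_CLOSE.match(line))

def table_balance(lines):
    """Number of table opens minus table closes; 0 means they pair up."""
    opens = sum(is_table_open(l) for l in lines)
    closes = sum(is_table_close(l) for l in lines)
    return opens - closes

print(table_balance([":::{| class=\"wikitable\"", "| cell", " |}"]))
```

Lines recognised this way could then be left out of the ordinary brace matching altogether.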

Layout

As [] and {} are wiki characters it is more important to match these up. Could the report be separated to have high-priority mismatches - [] {} - and low-priority mismatches ()? --Bequw τ 04:08, 25 June 2010 (UTC)

I've seen lots of examples of mixed curly brackets and parentheses, about equally split between those that should be one and those that should be the other. If you go down the high- and low-priority route, then in cases with mixed curly brackets and parentheses, if both ends are single, e.g. (...}, treat as low priority, but if one or both ends are doubled, e.g. {{...}), treat as high priority. This won't be foolproof, but it won't be any worse than what we have now. Thryduulf (talk) 08:43, 25 June 2010 (UTC)
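The split proposed in this section could be expressed roughly as follows. This is one possible reading of the two comments above, not an agreed rule: the function name `priority` is hypothetical, and "doubled" is interpreted here as either end being more than one character long.

```python
def priority(opener, closer):
    """Classify a mismatch by the bracket strings at its two ends.

    opener/closer are strings such as '(', '{{', '}' or ')';
    use '' when one end is missing entirely.
    """
    ends = opener + closer
    if "[" in ends or "]" in ends:
        return "high"                      # square brackets are wiki syntax
    has_curly = "{" in ends or "}" in ends
    has_paren = "(" in ends or ")" in ends
    if has_curly and has_paren:
        # Mixed case: single at both ends is low, doubled at either end is high.
        doubled = len(opener) > 1 or len(closer) > 1
        return "high" if doubled else "low"
    return "high" if has_curly else "low"

print(priority("(", "}"), priority("{{", ")"))
```

The report could then list the "high" group first and the "low" group in a separate section.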

Quotations

Quotes can be unbalanced (usually in terms of parentheses). We shouldn't balance them without checking the source material. This might be too complicated, but one way to remove these would be to disregard the passage= parameter. --Bequw τ 15:21, 1 July 2010 (UTC)

RTL links

Is it possible to balance the parentheses for the wikimedia links on مهر? --Bequw τ 20:25, 12 July 2010 (UTC)