Wiktionary talk:Wikitext style


(2) No whitespace after a heading start, nor before a header end

Meaning in between the ==='s and the heading text.

In English we have whitespace around words, except when they are bounded by punctuation { . , ; : ? ! “ ‘ ’ ” ( ) [ ] – — }. Putting strings of larger, letter-like characters flush with the text reduces readability. This may not be so bad in the default Win MSIE setup which uses a monospace font, but significantly impairs the scannability and readability for editors who have chosen to use a proportional font in their browser (for the sake of readability).

So I'm opposed to this proposal. The heading markup should be separated from heading text by single space characters. Examples below, showing more-or-less how this appears in my textarea. Michael Z. 2008-07-03 17:39 z

(This is Safari/Mac's default textarea font, Lucida Grande, in 11-px size as determined by Wiktionary's style sheet. As far as I know, this is how all Safari users will see the text.) Michael Z. 2008-07-03 18:23 z

  ==English==
  ===Noun===
  {{en-noun}}
  
  # An example with equals signs flush.
 
 

  == English ==
  === Noun ===
  {{en-noun}}
  
  # Another example, with spaced equals signs.
 
 

Note this is not a "proposal", but quite firmly established. Robert Ullmann 12:27, 4 July 2008 (UTC)
Where is it established? The examples in WT:ELE are set flush, but there is no specific advice.
In Wikipedia I believe either style is acceptable, and examples of both exist in the guidelines, but the original examples copied from meta are all spaced (examples and section). Michael Z. 2008-07-05 01:44 z
Spell-checkers may have problems with words which are not bounded by whitespace or punctuation.
And the Wikimedia software itself includes spaces when it creates new sections (with the talk-page “+” or “new section” link). Michael Z. 2008-07-15 04:27 z
Nevertheless, the community has long held that such spaces should be removed where they occur. This document is simply laying out some of the conventions we've been following for the past few years. --EncycloPetey 04:32, 15 July 2008 (UTC)
“rationale: visual consistency in editing, bot/code tractability”—Contradicting the Wikimedia software's behaviour diminishes consistency. Removing whitespace word separators confounds spell-checkers. Are there examples of how this helps bots or other code? Michael Z. 2008-07-15 22:43 z
There are many things the WM software does in generating new pages that are undesirable, so the fact that the software does something a particular way is no reason to favor that. Consider this recent addition [1] by a new user who tried the automated article generation tools. There is no whitespace in the headers, but there are bizarre template inclusions that needed to be fixed. --EncycloPetey 05:56, 16 July 2008 (UTC)
What is undesirable about putting whitespace next to words? If the rationale is “visual consistency,” then it is desirable to do what the software does. Michael Z. 2008-07-16 06:12 z
Huh? So if the software is adding pointless templates that interfere with page layout, then we should mess up all pages to be consistent? I don't follow. --EncycloPetey 06:15, 16 July 2008 (UTC)
I don't know what you're referring to. Software is adding spaces which improve readability of wikitext by editors and by software (spell-checkers), and it doesn't affect page layout at all. Our stated rationale of “visual consistency” is being disregarded if we choose to arbitrarily remove those spaces in some proportion of entries and no talk pages. There is solid rationale for leaving them in, and none at all that I can see for removing them. Michael Z. 2008-07-16 16:11 z

(unindented) MediaWiki puts == Heading surrounded by spaces == when you click the + on talk pages. We should thus copy that and put == English == on entries. Otherwise we are being inconsistent. There is no good reason not to do this, and in ignorance of this page I have always done that as it looks nicer to me. 131.111.220.6 22:41, 16 July 2008 (UTC) Conrad.Irwin 22:47, 16 July 2008 (UTC)

Please note that the MW software does not expect whitespace in between the =='s and the heading text. It parses the heading lines with ={1,6}(.+)={1,6} and does not strip the result string. The only reason spaces are "allowed" is that the HTML tidy along the way takes out the leading and trailing spaces inside h[1-6] tags. It is at best an accidental feature.
Note also that the 'pedia MOS shows headers without the spaces, but allows them as optional. (w:WP:HEAD)
The only place the spaces are used in the MW code is one time, in editing a new comment section (the "+" tab):
            if($this->section=="new" && $this->summary!="") {
                $toparse="== {$this->summary} ==\n\n".$toparse;
            }
On the other hand ...

sub-section

works ... as it is explicitly provided for. (don't miss the trailing spaces) Arguing from MW s/w Original intent probably is not very useful ;-)
*sigh* and keep in mind that this is exactly why this isn't a policy document. It is not wrong to use the spaces if you want, but various automated and partly automated edits need a canonical form to keep from switching them back and forth. The same goes for other things reading them, which can look for "===Noun===" rather than (={1,6})([^=<]+?)={2,6}(.*)$ then group(2) stripped of leading and trailing WS matching "Noun". There are a lot of people who can write simple code, and do, but not all the fiddly stuff. (And it takes 4 regex cases to handle the headers in AF, to catch various imaginative errors ;-) Robert Ullmann 10:40, 18 July 2008 (UTC)
Please don't “*sigh*” condescendingly. Proponents of removing spaces continue to ignore the wrongness of the stated rationale, but I'll keep trying to discuss this with some respect.
I don't understand much of your explanation, but you seem to be arguing about original intent yourself by discussing software internals which are hidden from editors and readers. I don't know if it's important why the software does or doesn't do it a certain way, or which component of Wikimedia adds the spaces. It puts the spaces in every time, and pretending that removing them promotes consistency is senseless. And I still don't see a single cogent argument as to why it's better without spaces either.
Let's either drop the pretence and remove the stated rationale, or adjust the bots so that they actually promote consistency, while making the wikitext more readable by editors and spellcheckers. Michael Z. 2008-07-18 16:20 z
I will say this exactly once more: this is not a proposal, it is very firmly long-established style. See the first version of the present ELE (re-written from prior discussion) from 4 years ago, 18 July 2004. The rationales are valid, no matter how dismissive you are of them.
If you would like to propose changing the established style, used in the layout policy (WT:ELE), then it is up to you to show how changing the 857,000+ entries that are consistent with the style (out of 861K), changing hundreds of programs that read and write wikt entries, and any number of policy document examples and preload templates etc has a benefit proportional to the cost. And this is not the place for that proposal. Robert Ullmann 05:41, 19 July 2008 (UTC)
The rationale is invalid, in addition to the other advantages I mentioned.
“visual consistency in editing”: Editors work here for months before discovering by accident that the example set by the WM software is considered deprecated by some. This is also inconsistent with normal English orthography.
“bot/code tractability”: Can you name one or two of the hundreds of programs which malfunction when they encounter spaces in heading titles? They should be adjusted to create code which is consistent with WM, rather than inconsistent. Michael Z. 2008-07-19 15:29 z
Re: "Editors work here for months before discovering by accident that the example set by the WM software is considered deprecated by some": This is as it should be. No one minds if editors include spaces; but our bots will remove those spaces. (Pace Robert, I don't think there's a good reason for it; but I don't think there's a good reason to mind it, either. It does make sense to standardize even such piffling details, because piffling details add up, and it's a waste of automation effort when half the code is handling all the variants MediaWiki allows. It's true that the spaceless standard is counterproductive at pages where section=new is used often — and it's kind of funny to see the archive-bot's first diff after it hasn't run for a while — but if we're talking about entries, which are the pages whose wikisyntax we actually care about, who cares what section=new does?) —RuakhTALK 16:04, 19 July 2008 (UTC)
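
For illustration only, here is a minimal Python sketch of the canonical-form point made in this thread: a tolerant pattern accepts both heading styles, and a normalizer rewrites either one to the flush form that a simple string comparison can then match. The names HEADING, heading_title, and normalize_heading are hypothetical. This is not AutoFormat's code or MediaWiki's parser, and it assumes balanced heading markup of two to six equals signs.

```python
import re
from typing import Optional

# Assumption: a heading is 2-6 equals signs, a title, then the same run of
# equals signs, with optional spaces or tabs around the title.
HEADING = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$")


def heading_title(line: str) -> Optional[str]:
    """Return the heading text without surrounding whitespace, or None."""
    match = HEADING.match(line)
    return match.group(2) if match else None


def normalize_heading(line: str) -> str:
    """Rewrite a heading line to the flush form; leave other lines unchanged."""
    match = HEADING.match(line)
    if match is None:
        return line
    level, title = match.group(1), match.group(2)
    return f"{level}{title}{level}"


if __name__ == "__main__":
    for sample in ["=== Noun ===", "===Noun===", "# A definition line."]:
        print(repr(sample), "->", repr(normalize_heading(sample)))
```

Once every heading has passed through something like normalize_heading, a reader script can test line == "===Noun===" directly. The same sketch could just as easily normalize toward the spaced form, so it illustrates the mechanics and does not settle which style is better.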


One Statement: [Insert University Here] has discovered that wordsthataremashedtogether are harder to read than words with whitespace. [Insert University Here] encourages all members of the internet community to please discover your fuck'n spacebar. (But seriously, that IS true: it is much easier for a HUMAN (yes, HUMAN; why poorly written bots matter is beyond me) to read == words == than ===words===, because the whitespace allows you to jump directly to the word instead of reading "equals equals equals WORD! equals equals equals".)

And just to be more annoying: 1 + 1 = 2. Notice how WhiteSpace is GOOD, because (23-535.4/586*524)%6^4 is too hard to read (though ^ should be written as 1^2 instead of 2 ^ 3).

(18) Definition line capitalization

In general, if a definition is a complete sentence, it should start with a capital and end with a period. If it is a word, list of words, or a phrase, it should not. This is covered in WT:ELE.

This doesn't appear to follow practice, or the guideline at WT:ELE#Definitions, which says “Each definition may be treated as a sentence: beginning with a capital letter and ending with a full stop” (my emphasis). Most definition lines are sentence fragments lacking a verb, but they are generally given a capital initial and trailing period for consistency.

No, phrases and single words are not supposed to be given an initial capital or trailing period. (The primary example is FL entries with translations.) Some do, and they should be fixed. Sentences are given capitalization and a period (or other punctuation); this differs from other dictionaries, where definitions that are sentences are often not punctuated. The operative word in the ELE text you quote is "may". Robert Ullmann 12:26, 4 July 2008 (UTC)
I'm not so sure. —RuakhTALK 14:38, 4 July 2008 (UTC)
Well, at least the “may” of the guideline seems to contradict the “should not” here. If I am interpreting it incorrectly, then let's amend one or the other so that there is no ambiguity. Michael Z. 2008-07-05 01:38 z
The problem is that most editors I've seen are liberal with the meaning of "sentence" for purposes of this situation. I do agree that each definition line should either be treated as if it were a sentence or else be left uncapitalized with no full stop. Also, the definitions within a single language section should all be formatted the same way. --EncycloPetey 04:36, 15 July 2008 (UTC)
I agree that most editors don't follow this. But it appears to me that the problem is that this guideline doesn't reflect common practice, i.e., it doesn't follow the principle of describing “details of the preferred style” of the community (nor is it clearly consistent with the guideline at WT:ELE#Definitions). Michael Z. 2008-07-15 22:52 z
I know of no one who does not follow one of these two style conventions. It is consistent with WT:ELE, because that document is primarily describing the format of English entries, for which a translation into English is never given. The section "Variations for languages other than English" hints that the definition line will be different for non-English entries, but is not explicit and gives no examples. --EncycloPetey 00:56, 16 July 2008 (UTC)
Okay, I just hit a few dozen random entries, and I see what you mean. But I still think that there's a gap in either the sense or the expression of this recommendation and the guideline and common practice.
I see that a large portion of FL terms are defined with single words or lists of them without caps or full stops.
But I think that for the great majority of English terms, and for a significant number of foreign terms, the definition is a phrase (not a sentence) with a cap and a period. If we mean “phrase” or “sentence fragment,” then we should write that and not “sentence.”
And if there is no consensus regarding a number of standard forms, then let's write them out somewhere, because right now there is a very large minority of entries which doesn't follow any of these, and I can't find any guidance to correct them with authority.
The inconsistency in the way definitions are written is not very professional. Michael Z. 2008-07-16 01:31 z
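
As a rough illustration of the two conventions discussed above (capital plus full stop, or neither), here is a small Python sketch that classifies a single definition line. The function name definition_style and the labels it returns are hypothetical, and the check is deliberately naive: it assumes the line begins with "#" followed by plain text, so a line starting with a wikilink or template counts as not capitalized, and a fragment that begins with a proper noun is reported as "mixed".

```python
def definition_style(line: str) -> str:
    """Classify a '#' definition line as 'sentence', 'fragment', or 'mixed'.

    'sentence' = initial capital and trailing full stop,
    'fragment' = neither, 'mixed' = only one of the two.
    """
    text = line.lstrip("#").strip()      # drop the list marker and padding
    starts_upper = text[:1].isupper()    # False for '[', '{', or empty text
    ends_stop = text.endswith(".")
    if starts_upper and ends_stop:
        return "sentence"
    if not starts_upper and not ends_stop:
        return "fragment"
    return "mixed"


if __name__ == "__main__":
    for sample in ["# An example with equals signs flush.", "# dog, hound", "# A dog"]:
        print(sample, "->", definition_style(sample))
```

Lines reported as "mixed" are the ones that follow neither convention and would need a human decision.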

(22) No HTML tables

I'm in favour of the principle, but it should be recognized that there are some potentially useful things which can't be accomplished with wikitables, like use of thead, tbody, tfoot, col, colgroup, row, rowgroup, and possibly the use of certain HTML attributes. Michael Z. 2008-07-03 17:39 z

Agreed. Also, wikitable markup doesn't always interact well with template use, stackable wikiformat characters, and so on. See green-collar for one example. —RuakhTALK 21:33, 3 July 2008 (UTC)
Are there situations where this can't be accomplished by setting up the table as a template or transclusion? That is, the HTML itself is placed in the template, so that it isn't in the entry and therefore won't confuse the bots? --EncycloPetey 04:38, 15 July 2008 (UTC)
I don't think we can set up a general-purpose table template, but I suppose any specific case could be set up as a template that's included in only one page — say, we could create a {{single-use/green-collar}}. Is that something we want to do? It could conceivably affect searching (since the text would no longer be in the entry itself), but we could find ways around that. (And for green-collar it wouldn't be a big deal, since it's just a quotation that'd need it.) —RuakhTALK 12:59, 15 July 2008 (UTC)
What's the point? Burying page content in a template makes things worse in several ways, even if one could “find ways around” the search problem.
Why not just state here that tables should be expressed as wikitables wherever practical? In light of the example above, this would follow actual practice, rather than complicate things for the sake of an unattainable ideal. We write for readers, not for editors, so the best attainable results trump pickiness over idealized wikitext.
Also, the rationale for this one is a bit confusing:
  1. “use of wikisyntax rather than HTML to make it more independent of rendering”—This is unclear to me.
  2. “having all tables generated by the software reduces "illegal" table syntax”—Agreed. “illegal” table syntax should say invalid or incorrectly formed table syntax, or at least have the quotation marks removed.
  3. “editors don't need to learn HTML table syntax”: This is questionable. HTML table syntax, wikitable syntax, and wiki preformatted block syntax using leading spaces can each be a burden to learn, but HTML table syntax is the only one that many non-wiki editors already know. No one needs to learn any particular one of these: they can simply enter data formatted per their preference, or unformatted, and allow another editor to improve it.
 Michael Z. 2008-07-15 23:47 z
I'm not sure about #2. I believe all tables are generated by the software; MediaWiki supports HTML-like table markup, but my understanding is that it supports it by parsing the markup, processing it, and ultimately generating XHTML for browsers. —RuakhTALK 00:23, 16 July 2008 (UTC)
I think that's accurate, but I think HTML may offer more opportunities to screw up the code. But come to think of it, closing tags can legally be omitted on most elements in a table in HTML 4, so maybe there isn't such a contrast. Michael Z. 2008-07-16 00:26 z
I think Wiki Idiots need to stop using WikiText. HTML is superior to WikiText at any given moment; people constantly fall back on it to do what WikiText can't. (Just a small note: look up brainfuck and tell me WikiText is not imitating it.) As for the tables, in my experience I'd say MediaWiki does NOT support HTML, but it leaves room for you to utilize it (if you type < then MediaWiki will turn it into a special character, but other than that, if you can get inside a < > you can type whatever you want, or so it seems). Putting the HTML vs WikiText aside:

1) Use of WikiText makes it independent of HTML (the only thing "rendering" could mean): Where did YOU learn Web Design?
2) Illegal Table Syntax you say?

Give me money

} (MediaWiki won't say it, but that is a syntax error)
3) And this bullshit about learning HTML should read "editors who learn the HTML equivalent of brainfuck should not learn HTML". HTML IS superior to WikiText; NOT using HTML in formatting is, well, idiotic. But OF COURSE HTML is ugly to look at, and a few standards should apply:

A) <div> is for idiots. I don't know why wikitext supports it, but it is for idiots. It's been superseded to hell, and I just spent an hour cleaning up a wikitext table FULL of <div align="center"> when a simple style="text-align:center" at the beginning would do that very thing.

B) TBODY is superior to WikiText. I am annoyed that I can't create a template {| <tbody style="background:color"> What advantages does this have? Cellspacing is no longer "color" but is transparent.

C) Did we mention how WikiText is inferior? You claim it is hard to write HTML, that is incorrect. It is hard to READ html... harder still to read WikiText, but with WikiText you don't CARE about the table as much as the content.

D) Thus we get to the REAL point: WikiText exists because people like to be stupid. It's much more difficult to program in WikiText, but it is advantageous because of its minimalism. The syntax drops out, allowing the text to be more easily readable. THAT is why people complain about HTML (again, <div> is for idiots): because HTML is about the program, not the content, and they can't read the "content".


Solutions? Remove <div>. And realize that 99% of all your problems occur because NO ONE LIKES WIKITEXT. That table I fixed? HORRIBLE HTML, WORSE WikiText... after I finished, it looked exactly the same with just a bgcolor tag and an align="left" (the table defaulted to center). Someone CLEARLY used a wikitext table generation program, because no one could possibly make such horrible code... willingly. WikiText Table Generation Programs are quite likely to be TRUE WYSIWYG IDEs, allowing direct manipulation of, well, everything. WikiText just forces you to use its horrible syntax and mash "show preview" to get the correct outcome.

Seriously now, would it HURT that much to have [table] [row] [col] [col] [row]

instead of THIS insanity? (You don't even need close tags as you wouldn't make a col inside a col or a row inside a row without making a new table [inside the col])


(Just Realize that... Reduces Illegal Syntax my Arse)

(21) No HTML entities or numeric references

Non-breaking spaces ought to be entered as “&nbsp;” because some web browsers seem to convert Unicode non-breaking spaces to plain spaces when editing (I haven't been able to determine which browsers in my testing). Of course, nbsp’s should be used sparingly because they clutter up the wikitext. Michael Z. 2008-07-15 23:55 z
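
To make the suggestion above concrete, here is a minimal Python sketch of a clean-up step that rewrites literal Unicode non-breaking spaces (U+00A0) as the “&nbsp;” entity, so they stay visible in the edit box and survive a browser that flattens them. The name encode_nbsp is hypothetical; nothing here is taken from an existing bot.

```python
NBSP = "\u00a0"  # Unicode NO-BREAK SPACE


def encode_nbsp(wikitext: str) -> str:
    """Replace each literal non-breaking space with the &nbsp; entity."""
    return wikitext.replace(NBSP, "&nbsp;")


if __name__ == "__main__":
    print(encode_nbsp("10\u00a0km"))
```

Ordinary spaces are left alone, so running it over text without U+00A0 changes nothing.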

Principles

“to help make the wikitext more tractable to bots and other code”—could someone point to particular examples, so we can have some understanding of the problems? Can I get a sense for it by reading about User:AutoFormat? I assume that the point is to reduce the number of errors generated without necessarily eliminating them, since a volume of hand-edited wikitext can be assumed to always have inconsistencies and garbage text.

“reflections of policy”—can we resolve to clearly label and link to the relevant policies? This could sidestep questions and debates which don't belong here. Michael Z. 2008-07-16 00:36 z

What is this document?

The “Principles” section is too vague on some points. The rationale supplied for many of the points implies that this is more than just a description of prevailing practice. I'm told this is not policy, and not intended to be. Is it a suggestion or a guideline?

What do we do if the point contradicts its rationale, as I believe is the case with no. 2?

By whom are the rationale, style, and details “decided on or preferred”? Should contentious issues be taken to the Beer Parlour to seek a broader consensus? (But I'm not suggesting to do this at the moment.) Michael Z. 2008-07-16 00:37 z