User talk:Connel MacKenzie/Normalization of articles

Return to Usertalk:CMK or Grease pit.

Normalization of articles

Return to User talk:Connel MacKenzie.


Preamble and discussion

Hi again Connel.

I wanted to start a non-Beer-parlour thread on the subject of normalizing articles. I used to do it, stopped for a while, started again, and am now trying to stop again. I think you and Stephen also do this so for now I just wanted to discuss it amongst us before taking it to everybody on the Beer parlour. Please invite other contributors you know who normalize though.

I think this needs to be opened up to anyone that wishes to comment, now, before going to WT:BP. --Connel MacKenzie T C 16:36, 6 May 2006 (UTC)
We can bring it to the BP by just {{including}} this page there. But you may first want to move it out of your user talk page space. —Vildricianus | t | 09:33, 7 May 2006 (UTC)
No way, no how. This is gigantic. My initial brain-storming list was a start, but about a quarter of them (or more?) have been contested. What I wish to present to WT:BP is the concise list of non-controversial terms only. As people see the positive effects each of these changes gradually has, we can add the contested or controversial topics to WT:BP separately, one per week. The ones I've struck out must not go any further at this time. But as this working-list settles down, I think we'll have a dozen items to present to the community, with explanation. That is probably too much for most people to grok all at once. Right now, we're still getting new, fresh viewpoints on things that seemed absolutely sound. Let's try to keep it small for a few more hours, then present only a tiny list of the things that we know are OK with everyone. --Connel MacKenzie T C 11:04, 7 May 2006 (UTC)
I should have struck out my comment; I went through all of it afterwards and saw how big it had become. Jeez! —Vildricianus | t | 11:09, 7 May 2006 (UTC)

By normalization I mean making minor formatting changes which usually make little or no difference to how a user sees the page, but which do make it conform more closely to what each of us considers best when editing a page.

Issues such as where to put blank lines and how many, whether to put spaces inside the == ==, or after asterisks in lists.

A few months ago I modified some of the stuff I had been doing for ages to bring it more into line with what I saw you were doing, but I seem to notice now some other things that Stephen and I might be doing differently, which has made me think we should check with each other about what such changes we've been making and agree with each other, then take what we come up with to the Beer parlour, altering the instructional pages on formatting if need be.

Do you think this is a good idea? Where should we start? — Hippietrail 00:59, 30 April 2006 (UTC)

Well, I started with a list of twelve names, then realized I'd forgotten about another two dozen people. Rather than have anyone feel left out, I'll skip naming names. I'm sure all are welcome to chime in, and some probably will, here.
Because these activities always turn out to be more controversial than imagined, it might be considered improper to have a back room meeting. OTOH, this is not exactly behind closed doors. Furthermore, practically discussing these things in a small group first may weed out some of the more bone-headed things.
Where to start is also difficult. I'm not as consistent as I probably should be. Some of it is from semi-automatic changes from my monobook, others are specific requests (over time.)
I also consider the 3rd level headings consolidation efforts to be part of normalization. Those consolidation efforts (that Polyglot started) give en.wikt: a consistent look and feel, that significantly adds to usability. --Connel MacKenzie T C 19:03, 2 May 2006 (UTC)
Hi Connel. Would it be a good idea, for each proposal, to link to a pair of example words showing the yes/no usages? Just so we are certain that we understand what we are talking about. SemperBlotto 08:20, 7 May 2006 (UTC)

The list

1. No whitespace before first language heading

1. I agree. I've always done this. But note that disambiguation see also goes before here, and many times the wikipedia link - but the latter suffers from much variance. — Hippietrail 16:32, 30 April 2006 (UTC)

Right - I always remove blank lines between {{see}} and the first language heading. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
1. No whitespace before first language heading.
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree. — Paul G 07:21, 7 May 2006 (UTC)
I agree. The 'pedia link should not go before the language header, as it may not apply to all languages on the page. Widsith 10:36, 7 May 2006 (UTC)
Disagree, but only in cases where there is a "SEE" tag or line ahead of the first language header --EncycloPetey 14:00, 7 May 2006 (UTC)
Why? To offset the first heading (vertically) incorrectly? --Connel MacKenzie T C 17:23, 7 May 2006 (UTC)
For section editing purposes, this functions as a separate section before the first section edit possibility. --EncycloPetey 17:38, 7 May 2006 (UTC) (POV on your part. Note that just because something is called "incorrect" doesn't mean it shouldn't be done. In the Netherlands, coffee with cream is termed "koffie verkeerd", which literally translates as "coffee wrong", but that doesn't stop me.)

2. No whitespace after a heading start, nor before a heading end (i.e. between the == and the actual text.)

2. Do you mean between the == and the actual heading name? (I prefer "heading" to "header" by the way though I sometimes use the latter term due to programming habits) — Hippietrail 16:32, 30 April 2006 (UTC)

Yes, I do. E.g. ==English==, but never == English ==. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
2. No whitespace after a heading start, nor before a heading end (i.e. between the == and the actual text.)
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree - these spaces are seen in many entries (mainly older ones, I think), but it is entirely superfluous. — Paul G 07:21, 7 May 2006 (UTC)
I agree. -- EncycloPetey 14:06, 7 May 2006 (UTC)
Realize that this one is actually, in part, determined by software. When you use the "+" tab to add a new section to a talk page, it prompts you for the heading, and when it constructs the == line for you, it does use those spaces. So, strictly speaking, for maximum consistency, we should either declare the spaces to be correct, or change the software to match our recommendation. (But I'm tossing this out as food for thought, not as a serious suggestion that the software needs to be changed.) –scs 14:07, 5 June 2006 (UTC)
I am unsure whether any discussion of this page is still actual, but I disagree here. As I mentioned on Connel’s talk page, it makes navigating easier if the spaces are there. H. (talk) 14:10, 12 March 2007 (UTC)
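
For anyone who wants to apply item #2 semi-automatically, here is a minimal sketch in Python. It is only an illustration: the function name tighten_heading is made up for this example, it assumes headings use two to six balanced = signs, and it also drops spaces or tabs after the closing == (item #3) as a side effect.

```python
import re

# Assumption: a heading is a line made of 2-6 "=" signs, a title, and the
# same number of "=" signs again, optionally padded with spaces or tabs.
HEADING_RE = re.compile(r"^(={2,6})[ \t]*(.*?)[ \t]*\1[ \t]*$")


def tighten_heading(line: str) -> str:
    """Return the line with padding inside the == markers removed.

    Lines that do not look like a balanced heading are returned unchanged.
    """
    match = HEADING_RE.match(line)
    if match is None:
        return line
    marks, title = match.group(1), match.group(2)
    return f"{marks}{title}{marks}"
```

So "== English ==" would come back as "==English==", while definition lines and other text pass through untouched.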

3. No spaces or tabs after a header

3. Agree. I remove these often. Note that spaces at the end are usually due to copying and pasting from some other software. This is a very minor point for me. — Hippietrail 16:32, 30 April 2006 (UTC)

Not minor at all, as it throws off section editing - resulting in the wrong section being edited. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
3. No spaces or tabs after a header
I don't know what this one means. Do you mean word spaces before == or line spaces before the following text?—Stephen 09:16, 2 May 2006 (UTC)
Neither. This is about junk coming after "==English==" on the same line. --Connel MacKenzie T C 13:38, 2 May 2006 (UTC)
  • I agree we have to work around this bug: However, it is our duty to report it. I was looking to see if it's already reported earlier but that machine crashed. If anybody else can do that better than me, please do. — Hippietrail 21:10, 2 May 2006 (UTC)
    • Is it a "bug" or a "known feature"? (Note the lack of smiley.) Really, the "bug" I see is that the MW software doesn't enforce this. --Connel MacKenzie T C 16:39, 6 May 2006 (UTC)
No comment on this until it has been clarified/fixed. — Paul G 07:21, 7 May 2006 (UTC)

4. No comments after header, on the same line as the header (breaks section editing!)

4. Is there a better place for such comments? I have done this from time to time when I've thought it appropriate. My lost parser could handle this. — Hippietrail 16:32, 30 April 2006 (UTC)

Again, as Vild points out below (and I expounded on above) this breaks section editing numbering...so clicking on the [Edit] link on the right side of the page edits the subsequent section instead. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
On #4: it seems to be quite important that there is nothing on the same line as the heading since this confuses the numbering of sections (you end up editing different sections than those which you intended to if there are comments after any header). —Vildricianus 18:27, 30 April 2006 (UTC)
4. No comments after header, on the same line as the header (breaks section editing!)
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree as with #3. #3, #4, and #5 should be merged. — Hippietrail 21:16, 2 May 2006 (UTC)
Do you mean for the "final" list (whatever that is) or that I should abandon the numbering concept within this page? I separated them originally for no reason whatsoever, but on reflection, they each have different reasons for being problematic. {Shrug}. --Connel MacKenzie T C 16:42, 6 May 2006 (UTC)
I agree - comments should go on the following line. — Paul G 07:21, 7 May 2006 (UTC)
I agree -EncycloPetey 14:06, 7 May 2006 (UTC)

5. No templates after header, on the same line as the header (breaks section editing!)

5. I've also done this. My reasoning was that I wasn't sure that putting a category or a comment (as above) on its own line may introduce an extra blank line on some browsers. — Hippietrail 16:32, 30 April 2006 (UTC)

Again, it breaks section editing numbering. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
5. No templates after header, on the same line as the header (breaks section editing!)
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree as with #3. #3, #4, and #5 should be merged. — Hippietrail 21:17, 2 May 2006 (UTC)
I agree, as for #4. — Paul G 07:21, 7 May 2006 (UTC)
I agree, and dislike seeing anything on the same line as a header. --EncycloPetey 14:06, 7 May 2006 (UTC)
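
Since #3, #4 and #5 all come down to "nothing after the closing == on the same line", one check could cover all three. The following is a rough Python sketch, not a tested bot: find_heading_junk is a hypothetical name, and it assumes balanced headings of two to six = signs whose titles contain no = themselves.

```python
import re

# Assumption: group 1 is the opening marker, group 2 the title, and group 3
# whatever follows the matching closing marker on the same line.
HEADING_WITH_REST = re.compile(r"^(={2,6})(.+?)\1(.*)$")


def find_heading_junk(wikitext: str) -> list[tuple[int, str]]:
    """List (line number, trailing text) for headings followed by anything.

    Trailing spaces, tabs, comments and templates are all reported, since
    the thread treats each of them as a problem for section editing.
    """
    problems = []
    for number, line in enumerate(wikitext.split("\n"), start=1):
        match = HEADING_WITH_REST.match(line)
        if match and match.group(3):
            problems.append((number, match.group(3)))
    return problems
```

Reporting rather than fixing seems the safer choice here, because a trailing comment or template has to be moved to the following line by hand rather than simply deleted.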

6. No blank line after any header except for all level two language headings

6. I used to remove such blank lines but it is one of the changes I made after seeing that you were adding them. — Hippietrail 16:32, 30 April 2006 (UTC)

So it was you removing them? Ah ha!  :-) . Well, I think that having the language heading stand out in the edit section is pretty important. (Gah, did I just say "heading" and not "header"? I'm messing up my own conventions now.) Also, AFAIK, this was the convention used by a clear majority of users, when I was learning Wikt: layout style.
6. No blank line after any header except for all level two language headings
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree. — Paul G 07:21, 7 May 2006 (UTC)
Not sure. I can't think of any cases where I would disagree outright, but I have to think about this one. --EncycloPetey 14:06, 7 May 2006 (UTC)

7. Templates generally prohibited in headings and translation sections (notable exceptions: {{acronym}}, {{abbreviation}}, {{initialism}}.)

7. what about ((Acronym)) and ((Initialism))? Are these exceptions or should we stop this practice. Personally I've been doing this since it seemed standard but I never liked it. — Hippietrail 16:32, 30 April 2006 (UTC)

I've waffled a few times about this - I think I started this practice way back when. User:Dmh was doing his category/template experiments at the time; I took these to this "extreme" but I've never seen a viable alternative. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
Even if the categories are "solved" (e.g. the 4th parent category becomes a separate super-category, but no longer a parent relationship to the other three categories) the template(s) would still need to add two (or more) categories to each entry. Using AWB, it is conceivable to subst: the template and zap the category to the bottom of the page. But AWB is not permitted to do that on en.wikt: just yet. Is it Vild? --Connel MacKenzie T C 19:28, 2 May 2006 (UTC)
So you advocate removing the templates like {{m}}, {{f}}, {{n}}, {{c}}, {{pl}}, etc from all the translation sections? I would agree with that. --EncycloPetey 14:06, 7 May 2006 (UTC)
No. What? Item #7 is talking only about characters that appear between == and == (or === and ===, or ==== and ====.) --Connel MacKenzie T C 17:26, 7 May 2006 (UTC)
For clarity: leave out the "and translation sections" part. At first, I didn't get that either, actually, until I realized this was about the {{lang}} templates ({{sv}}, {{nl}} etc). It was, right? Perhaps that deserves a separate section, although most of the work is restricted to a dozen more templates to be subst:ed. —Vildricianus | t | 19:06, 7 May 2006 (UTC)

7. Templates generally prohibited in headings and translation sections (notable exceptions: {{acronym}}, {{abbreviation}}, {{initialism}}.)

I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree. — Paul G 07:21, 7 May 2006 (UTC)
I agree. --EncycloPetey 14:06, 7 May 2006 (UTC)
I agree, and personally I'd go farther and get rid of {{acronym}}, {{abbreviation}}, {{initialism}}, too. –scs 14:10, 5 June 2006 (UTC)

8. Wikification of non-"top 40" languages (as per HT's/SGB's final lists - I could care less.)

8. Stephen and I, and perhaps others have been discussing a more common sense approach than "top 40". The reasoning is that exotic languages remain exotic to the majority of users (readers) even if a fan of such language adds thousands of entries, and a well-known language with few articles still doesn't need to be looked up often. A sub-point is that I always wikify exotic language names also in the level-2 heading for the exact same reason. I know others don't do this and it may be that some undo this. — Hippietrail 16:32, 30 April 2006 (UTC)

As Vild comments below, I think you guys have a decent handle on these. I am mostly ambivalent about the final list. (I have my own POV, of course, but for this I can mostly ignore it.) --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
On # 8: I once tried to bring this to the public but few people (except the two of you) replied there. See Wiktionary:Translations/Wikification. I'd also reason that "exotic" languages are wikified both in translation sections as in level-2 headers. —Vildricianus 18:27, 30 April 2006 (UTC)
8. Wikification of non-"top 40" languages (as per HT's/SGB's final lists - I could care less.)
I agree.—Stephen 09:16, 2 May 2006 (UTC)
Should merge #8 and #9. — Hippietrail 21:20, 2 May 2006 (UTC)
I agree. — Paul G 07:21, 7 May 2006 (UTC)
I agree, but the language list for the top 40 should be made easier to find. --EncycloPetey 14:06, 7 May 2006 (UTC)
I really don't like the disparity. I understand it's nice to have a handy link to the entry describing a language you've never heard of; I understand such a link is distracting and unnecessary for a language everybody's heard of, but still, it seems odd to me to have wikilinks in headers at all, and just wrong to have this split scheme where sometimes the language header is a link and sometimes it isn't. I wish there were some completely different way to provide the convenience link to the language name, one that didn't have this disparity problem at all. –scs 14:17, 5 June 2006 (UTC)

Please see WT:BP#Language wikification in translation tables round 7 (can be found here now H. (talk) 11:49, 1 March 2007 (UTC)) and comment there. — Vildricianus 15:30, 5 June 2006 (UTC)

9. De-wikification of all "top-40" languages (as defined in previous number)

9. Goes with #8 — Hippietrail 16:32, 30 April 2006 (UTC)

As Vild says below, yes, this list is identical to #8.
9. De-wikification of all "top-40" languages (as defined in previous number)
I agree.—Stephen 09:16, 2 May 2006 (UTC)
Should merge #8 and #9. — Hippietrail 21:20, 2 May 2006 (UTC)
I agree; this goes with #8. — Paul G 07:21, 7 May 2006 (UTC)
I agree. --EncycloPetey 14:06, 7 May 2006 (UTC)

10. One blank line before all subsequent headers

10. I feel strong agreement here and have always done it. I have a pet peeve against "compressed" articles - those which use few or no blank lines. I find them unreadable. — Hippietrail 16:32, 30 April 2006 (UTC)

On #10: complete agreement. —Vildricianus 18:27, 30 April 2006 (UTC)
10. One blank line before all subsequent headers
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree - like Hippietrail says, entries with no blank lines between sections are difficult to read and edit. — Paul G 07:21, 7 May 2006 (UTC)
Disagree. I've tended to compress the initial sections (language, etymology, pronunciation) together, except when these sections are more than a line or two. I also disagree for a variety of foreign language situations. We have a current discussion (I forget where, at the moment) concerning the best place to put categorization tags for foreign languages. One favored suggestion is to place them immediately following the language header, which is the one I favor. I see no reason to have an additional blank line after that, though this is a special case. --EncycloPetey 14:23, 7 May 2006 (UTC)
This formatting guideline/recommendation has nothing to do with where the categories go. I said in the category-relevant sections that my opinion about category placement is just too controversial for this conversation. So, it is my opinion that in the case where those categories are "in the relevant section" that there would still be a blank line. e.g.
==Swahili==
[[Category:Swahili language]]

===Noun===
[[Category:Swahili nouns]]
'''@#$$%SDSAa'''  (romanized term)

# def #1
The blank lines here are what are being discussed, not whether to shuffle the categories to the bottom of the page. For foreign language entries, the white space is perhaps more important, as English-only editors have such great difficulty understanding what is there. --Connel MacKenzie T C 17:42, 7 May 2006 (UTC)
  • Note: finding "compressed" entries is a good way of identifying contributions from new Wiktionarians, that may need some gentle guidance. --Connel MacKenzie T C 16:44, 6 May 2006 (UTC)
Good point, although these might be hard to find without writing a special bot to track them down. — Paul G 07:21, 7 May 2006 (UTC)
Well, I haven't written a custom search for these yet, but I plan to.  :-) --Connel MacKenzie T C 09:02, 7 May 2006 (UTC)
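
As a starting point for the custom search mentioned above, here is a small Python sketch that flags "compressed" entries. It is an assumption-laden illustration rather than an existing tool: the name headings_missing_blank_line is invented, the first heading is skipped because item #1 covers it, and any line that starts and ends with two to six = signs is treated as a heading.

```python
import re

# Assumption: a heading line starts and ends with 2-6 "=" signs.
HEADING = re.compile(r"^={2,6}.*?={2,6}\s*$")


def headings_missing_blank_line(wikitext: str) -> list[int]:
    """Return 1-based line numbers of headings not preceded by a blank line.

    The first heading on the page is ignored, since item #1 says nothing
    should be inserted before it.
    """
    lines = wikitext.split("\n")
    missing = []
    seen_first_heading = False
    for index, line in enumerate(lines):
        if not HEADING.match(line):
            continue
        if not seen_first_heading:
            seen_first_heading = True
            continue
        if lines[index - 1].strip() != "":
            missing.append(index + 1)
    return missing
```

An entry laid out like the Swahili example above returns an empty list, while an entry with every heading jammed against the preceding text returns one line number per offending heading.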

11. One blank line after "inflection line" (originally for Polyglot, now this has grown on me.)

11. I always used to do this but grudgingly gave it up and practised the opposite when I saw that was what you were doing. I would vote strongly in favour of keeping the headword/inflection/romanization line attached to the heading above it and with a blank line below it. — Hippietrail 16:32, 30 April 2006 (UTC)

Perhaps I worded #6 unclearly. I agree 100% with what you say here. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
11. One blank line after "inflection line" (originally for Polyglot, now this has grown on me.)
I don't know what this means.—Stephen 09:16, 2 May 2006 (UTC)
E.g. "===Noun===" / "{{en-noun-reg}}" / <<blank line>> . --Connel MacKenzie T C 13:38, 2 May 2006 (UTC)
Just to make this clearer: "No space between POS line and headword/inflection line, but always a space after" — Hippietrail 21:23, 2 May 2006 (UTC)
I agree (with Hippietrail's clarification). — Paul G 07:21, 7 May 2006 (UTC)
I agree. --EncycloPetey 14:23, 7 May 2006 (UTC)

12. One blank line before a translation section gloss (except first gloss)

12. I have no opinion. Whatever others decide is fine with me. — Hippietrail 16:32, 30 April 2006 (UTC)

12. One blank line before a translation section gloss (except first gloss)
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree - the blank line is unnecessary and makes no difference to the rendered page. — Paul G 07:21, 7 May 2006 (UTC)
I agree for the reasons Paul G. gives. --EncycloPetey 14:23, 7 May 2006 (UTC)

13. One blank line before ----

13. Agreed. I've always done this. — Hippietrail 16:32, 30 April 2006 (UTC)

13. One blank line before ----
I strongly disagree on this one. It results in compression, and the visual impact of this would be completely unacceptable in print, and I think it's unacceptable in a webpage as well. As a general rule, pages benefit visually from a little whitespace artfully applied to margins, between paragraphs, and before headers. Putting just a single blank line before ---- has the same effect as putting no blank line. I feel strongly that there should be two blank lines before each ----. —Stephen 09:16, 2 May 2006 (UTC)
  • I've never heard an explanation for this before, so I always thought it was a mistake. Now that I understand it is not a mistake, I'll try to look at a few more before forming an opinion on it. --Connel MacKenzie T C 13:38, 2 May 2006 (UTC)
Solved this quite conveniently by adapting monobook.css. Perhaps this is a good candidate for adoption in MediaWiki:Monobook.css:

.ns-0 #bodyContent hr { margin-top: 2.5em; margin-bottom: 0.5em }

(or adapt as you please). —Vildricianus 17:26, 2 May 2006 (UTC)
This is important. Everybody needs ---- in edit mode. Non-monobook users need it always. Am I right in saying most monobook users would prefer it hidden in view mode? If so we can move a version of this code into the standard monobook CSS. But can anybody think of a way to hide the actual line, but still add extra space? This aspect of this topic is beyond normalization and should probably be taken to the Beer parlour. — Hippietrail 21:29, 2 May 2006 (UTC)
I've boldly gone ahead with this - see the Beer parlour. — Hippietrail 22:47, 2 May 2006 (UTC)
I wish I had seen this comment sooner.  :-( I've applied your fix to my personal monobook.css, so I'm content with this. --Connel MacKenzie T C 16:46, 6 May 2006 (UTC)
I've only ever put a blank line after the ----. If it makes a difference (and I didn't know that it did) I'll put one before as well. — Paul G 07:21, 7 May 2006 (UTC)

14. One blank line after ----

14. Agreed. I've always done this. Though I think Stephen is adding two lines here. — Hippietrail 16:32, 30 April 2006 (UTC)

On #14: I find that the Wiki layout renders too little space before and after horizontal dividing lines, which two blank lines may solve. But customizing the top and bottom margin in CSS might be a better solution. —Vildricianus 18:27, 30 April 2006 (UTC)
14. One blank line after ----
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree. — Paul G 07:21, 7 May 2006 (UTC)
I agree. --EncycloPetey 14:23, 7 May 2006 (UTC)
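
Because #13 is still disputed (one blank line before ---- versus two), a normalizer for #13 and #14 would probably want the counts as parameters. A minimal Python sketch under that assumption follows; normalize_rules and its before/after parameters are hypothetical names, and a ---- at the very top or bottom of the page still gets the padding added.

```python
def normalize_rules(wikitext: str, before: int = 1, after: int = 1) -> str:
    """Force a fixed number of blank lines before and after each ---- line."""
    lines = wikitext.split("\n")
    out = []
    i = 0
    while i < len(lines):
        if lines[i].strip() == "----":
            # Drop any blank lines already collected ahead of the rule.
            while out and out[-1].strip() == "":
                out.pop()
            out.extend([""] * before)
            out.append("----")
            # Skip the blank lines that currently follow the rule.
            i += 1
            while i < len(lines) and lines[i].strip() == "":
                i += 1
            out.extend([""] * after)
            continue
        out.append(lines[i])
        i += 1
    return "\n".join(out)
```

Calling it with before=2 would give the two-line variant argued for under #13, so the same code works whichever way that point is settled.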

15. One blank line before categories

15. Agreed. Didn't used to like it but I've been converted. — Hippietrail 16:32, 30 April 2006 (UTC)

But what about the cases where categories are not placed at the end of the page? This is an on-going discussion, but the issue is that people editing only a single language on pages with multiple language headers need access to the categories for editing. One proposed solution (which I favor) is that for non-English categories, the category tags would be placed immediately following the language header. In this case, I would not favor including a blank line before the categories. --EncycloPetey 14:23, 7 May 2006 (UTC)
Gah. Excellent point. This was written with the assumption that categories were moving to the bottom of the page. So this needs to be clarified to refer only to categories (or groups of categories) appearing at the end of 1) A language section 2) the entire entry. --Connel MacKenzie T C 17:45, 7 May 2006 (UTC)
15. One blank line before categories
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree. — Paul G 07:21, 7 May 2006 (UTC)
I agree in cases where the categories appear at the end of the page. --EncycloPetey 14:23, 7 May 2006 (UTC)

16. One blank line before interwiki links

16. Agreed. Between cats and interwikis. — Hippietrail 16:32, 30 April 2006 (UTC)

16. One blank line before interwiki links
I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree, although I don't think I've ever used interwiki links. — Paul G 07:21, 7 May 2006 (UTC)
I agree. --EncycloPetey 14:23, 7 May 2006 (UTC)

17. controversial: One blank space after "[[Category:"

17. I don't like this one at all. I also don't like the space after * lists but do like the space after # lists (definitions). These are personal feelings which of course don't matter. My feeling is mainly based on my belief that without a space is much more common already in the first 2 cases, in the last case my choice is based on readability. — Hippietrail 16:32, 30 April 2006 (UTC)

Well, I can see that. If there is any (consistency?) reason at all for prohibiting the space after "[[Category:" then we should prohibit it, or else I'll keep doing it (inconsistently.) I like the spaces to make word-wrapping more natural and browser-controlled. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
On #17: No. I've so far only seen it being used by Connel, and I don't really like it. —Vildricianus 18:27, 30 April 2006 (UTC)
17. controversial: One blank space after "[[Category:"
I don't suppose it matters, but I never do this, and I remove them when I find them.—Stephen 09:16, 2 May 2006 (UTC)
Connel's argument makes sense so I will go with the popular opinion on this point. — Hippietrail 21:52, 2 May 2006 (UTC)
Some more random thoughts: for the syntax [[:Category: ...]] on a talk page or referenced elsewhere, this may make sense. But in entries, for the [[Category:...]] syntax, I can see the need for consistency. There is no good reason for me to break with tradition on this one. --Connel MacKenzie T C 16:50, 6 May 2006 (UTC)
I don't think that the blank space is necessary. Coming from a programming background, I see this as being just like the C++ scope resolution operator :: for namespaces. The convention there is not to insert any spaces either side of the operator: namespace::class::subclass::method, for example. I think this is sufficiently readable as the colon takes up much less space than a letter and so can easily be skipped by the eye, IMO. — Paul G 07:21, 7 May 2006 (UTC)
  • OK, this one gets stricken. Not controversial: rather, this is not liked by anyone (except me.) I have stopped doing this (in the main namespace) but I'll continue to use it as needed, when discussing long category names in WT:BP or other non-main-namespace places. --Connel MacKenzie T C 08:59, 7 May 2006 (UTC)

18. Definition lines should be sentences, starting capitalized, ending with a period "."

18. I thought our policy was that if they are sentences they must be capitalized and end with a full stop, if they are simple glosses they should be all lower case with no full stop. This I agree with. — Hippietrail 16:32, 30 April 2006 (UTC)

Um, you are correct. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
18. Definition lines should be sentences, starting capitalized, ending with a period "."
I strongly disagree with this point. First, very few of the definitions found here are complete sentences. For example, for Armenian, the definition is "1. Of, from, or pertaining to Armenia, the Armenian people or the Armenian language". I believe rewriting the definition as a complete sentence would not be beneficial.
Second, almost no dictionary capitalizes definitions, and in the case of some definitions, capitalization actually causes confusion. If a definition starts with a noun, for example, capitalization sometimes suggests that the word is actually a proper noun.
Third, some words have translations instead of definitions. For instance, the Russian word солнце is "defined" as sun. The capitalized Sun is the wrong meaning. Even if we decide to keep capitalizing definitions, we certainly should not capitalize translations. In my opinion, neither should be capitalized, but parenthetic categorizations such as (Law) or (Slang) should be. —Stephen 09:16, 2 May 2006 (UTC)
For what it's worth, there seem to be two types of definitions that could benefit from sentence-style vs. non-sentence-style differentiation.
  • The common type of definition is one of equivalence. "foo ==Noun== # a type of bar" says that "foo" means "a type of bar".
  • The less common type of definition is used where such equivalence is impossible, unclear, or difficult. "the ==Definite article== # The is the definite grammatical article that...." describes "the" but does not provide its equivalence.
It would be great to see the two types of definitions differentiated in some way. Why not format the common type in the simple manner (i.e. no forced initial capital and no period) and format the rare type in sentence style (i.e. forced initial capital and ending in a period)? Hmm, I suppose this belongs on BP instead of Connel's talk page, so feel free to dismiss this post for its location or to ridicule its naïve hope for consensus. Rodasmith 18:34, 2 May 2006 (UTC)
If the above "[...] sentences [...] must be capitalized and end with a full stop [but] simple glosses [...] should be all lower case with no full stop" is current policy, it completely agrees with my above recommendation, so I hereby withdraw the above post. Rodasmith 19:36, 2 May 2006 (UTC)
If this turns out to be controversial, I would no longer regard it as normalization but as a formatting standard that needs to be voted on. I am completely open to thoughtful opinions on this. — Hippietrail 21:56, 2 May 2006 (UTC)
If you mean that the definition of cow, say, should be "A cow is an animal with four legs, one at each corner." rather than "an animal with four legs, one at each corner" - then I disagree. I prefer the second version. SemperBlotto 22:02, 2 May 2006 (UTC)
I'm not sure whom you were addressing, SemperBlotto, but if you were addressing me, then don't worry, I do not mean that definitions of such words should be complete sentences. Most words are best defined with "simple glosses" instead and hence have no need for an initial capital, a full stop, or anything else normally imposed by English grammar on sentences. As Hippie says, though, further discussion would be beyond the scope of normalization. Rodasmith 22:33, 2 May 2006 (UTC)
I think we are soliciting general comments: I don't think SB's comment was directed at anyone. --Connel MacKenzie T C 17:00, 6 May 2006 (UTC)
  • Perhaps I misjudged how controversial this is. I can see this as a topic worthy of being voted on. But to my understanding, it was. This was a minor skirmish on Wiktionary talk:Entry layout explained, that I thought had been resolved.
  • SemperBlotto, please explain why you think your second format is better. I don't see it at all. To me, the second format you gave is just inconsistent. --Connel MacKenzie T C 17:00, 6 May 2006 (UTC)
This one is a little bit contentious. As far as I am aware, very few, if any, definitions in Wiktionary are sentences. There is a linguistic principle that a definition should be substitutable for the word it defines, so that in a sentence, the word can be replaced by the definition and leave the meaning of the sentence unchanged. Hence you can replace "I saw a cow" with "I saw an animal with four legs, one at each corner" and the sentence has the same sense (provided cows are the only quadrupeds with a leg at each corner). It is not possible to do this with full sentences: *"I saw a cow is an animal with four legs, one at each corner" is incorrectly formed. Children's dictionaries sometimes use this format to make their definitions easier to understand to youngsters, but it is not appropriate for Wiktionary. Newbies and one-off contributors sometimes post "full-sentence" definitions, and when I see them, I change them to "substitutable" definitions.
Having said that, I think the issue here has been incorrectly worded: I see it as being whether definitions should be all in lower-case or should begin with a capital letter and end in a full stop/period. That is not the same as their being full sentences, because, simply put, they are not. (I'll leave it as an exercise to the reader to look up the rules for what is required for a string of words to be a sentence.)
I have no strong views either way, but agree with the points made about possible confusion between the capitalised and lower-case initial letter of the first word. I probably come down in favour of the lower-case form, for several reasons:
  1. Avoidance of the confusion mentioned above.
  2. Most other dictionaries do this (although it should be noted that the OED does not). This could be to save ink or space, which, given that the current OED runs to 20 large volumes, is clearly less of an issue for them.
  3. Initial capitals and final full stops suggest sentences, when the definitions are not (although see how I am formatting this list :) ).
  4. Consistency with non-English entries, where only translations or glosses are given and these are in lower-case with no final full stop.
  5. It makes it clearer that the definition can be substituted for the word.
  6. It looks silly when the definition is a single word.
  7. Wikification of the initial word is straightforward ([[word]] rather than [[word|Word]]).
  8. Less effort is required.
So I think that probably puts me firmly against this policy. — Paul G 07:21, 7 May 2006 (UTC)
Wow. Well, spot checking a few Webster's 1913 entries, they do all seem to be in sentence form, not replacement form. Perhaps it is a pondian issue, after all? As far as I knew, the replacement form is only used in smaller dictionaries, to save space.
To be honest, I never understood the replacement form until I read your explanation above, Paul. Thank you for the clarity.
For your reason #4, I was suggesting this rule apply to those as well.
For your reason #6, I am pretty sure all English words can be a sentence by themselves. Subject/noun/verb is of course the correct requirement, but two of the three are and often can be implied. Run. (Yup, that's a sentence.) Fire! (Yup, that's a sentence.) I. (Yup, that's a sentence.) No. (Yup, that's a sentence.) Yes. (Yup, that's a sentence.) Spaghetti. (Yup, that's a sentence.) Leeds. (Yup, that's a sentence.) Fishing. (Yup, that's a sentence.)
For your reason #7, that is a very good point. For nouns, I usually precede the term with "A " to avoid that issue; for verbs I use "To ".
--Connel MacKenzie T C 08:48, 7 May 2006 (UTC)
  • I also forgot to mention: the cow example wasn't quite right. Substituting the Wiktionary definition, you'd get "I saw An animal with four legs, one at each corner.." which if you read without regard for capitalization or punctuation, would be coherent, even in your scenario. --Connel MacKenzie T C 08:55, 7 May 2006 (UTC)

I actually don't think this is really part of the normalization process, but something of a different nature. After all, it does make a difference for the reader, doesn't it? So perhaps this has to wait for a second round of discussion, if all this is brought to the BP. But on topic: I'm with Connel here (I think so), and I don't think it's a Pondian issue. I prefer "An animal with four legs, one at each corner." On the other hand, I understand that many definitions are still lacking, and it would be a waste of time to make lacking or embryonic definitions into sentences, as they still need to be expanded. —Vildricianus | t | 09:55, 7 May 2006 (UTC)

I think this has to be taken on an entry-by-entry basis. I feel, with Paul and Stephen, that having to make every definition a full sentence may sometimes be a constraint. However note that with some complicated terms, eg theological terms or abstract concepts, a ‘replacement form’ is not always possible, and full-sentence (ie natural English) explanation is desirable there. Widsith 10:44, 7 May 2006 (UTC)
I can't see any rationale to support one form over another. The only comment I'd make is that all the definitions for a single word should be formatted in a single form, rather than mixed (which looks bad, as if someone wasn't paying attention to what they were doing). --EncycloPetey 14:23, 7 May 2006 (UTC)
I guess my rationale for thinking it was a non-controversial "normalization issue" (which it is not) was that if we are striving for consistency, then all definitions in Wiktionary should appear as sentences. I'd never heard Paul's eloquent rationale for the "replacement style" before, and while that does affect my POV, I'm not entirely sold by any means. I still think that to be consistent, Wiktionary should choose a single style...which would force it to be "sentence style." --Connel MacKenzie T C 17:50, 7 May 2006 (UTC)
Wow. Lots of good arguments here, which I haven't digested all of. Me, I'm strongly in favor of the "replacement style", for precisely the reason Paul described. I'm somewhat in favor of formatting most definitions (even though they be fragments) with initial caps and final periods, if for no other reason than that I think they look better that way. But there are exceptions. Sometimes, for a complicated definition, "replacement style" is unwieldy, and the reader is better served by a full, grammatical sentence. When a definition needs a second (or third) clarifying sentence, those are full sentences even if the first is a fragment. Finally, when a definition is a translation I agree that it's best left unadorned, with neither initial-cap nor final-stop. So I'm guessing -- at least based on my own proclivities -- that the "policy" here would end up involving several decision points and judgement calls, and would not, after all, end up imposing any kind of absolute uniformity, neither decreeing always or never full sentences, nor always or never caps and periods. –scs 14:30, 5 June 2006 (UTC)

19. Never allowing multiple blank lines

19. Agreed. But see #14. — Hippietrail 16:32, 30 April 2006 (UTC)

I agree except before ---- as I described above in point 13.—Stephen 09:16, 2 May 2006 (UTC)
I (somewhat controversially) altered the default monobook CSS to hide the ---- but increase blank space in its place. I think it ought to be default so I've started a topic on the Beer parlour. I think this is more cross-platform than Stephen's solution which in my experience gives different results in different browsers. Please comment. — Hippietrail 20:35, 6 May 2006 (UTC)
Is this resolved by .css now? Can we say "Never allow multiple blank lines" without exception now? --Connel MacKenzie T C 17:01, 6 May 2006 (UTC)
If this has been sorted out, I agree. — Paul G 07:21, 7 May 2006 (UTC)
Disagree in cases where we have inflection lines and images and wikipedia links and... I'm not sure how this should be handled, but multiple blank lines is the only solution I've found for certain cases where all the included items end up overlapping each other. --EncycloPetey 14:23, 7 May 2006 (UTC)
That's considered as bad layout I guess. There should be other solutions such as adjusting margins. —Vildricianus | t | 15:12, 7 May 2006 (UTC)
No Vild, it's not bad layout. EncycloPetey, what I do in those cases is migrate the float=right items around. So, one before the ==English==, one after ===Etymology===, two after ===Noun=== (before the inflection line, with no blank lines) and any others below. Could you please provide an example term or two, so we can all clearly see what you mean? I am pretty sure I still disagree with multiple blank lines even in this circumstance. The worst I've had to do so far, to date, is use {{-}} to help things align...but not multiple blank lines. --Connel MacKenzie T C 17:54, 7 May 2006 (UTC)
Hmm? I think it is (and this is regarding this section's topic, namely two blank lines). What do you think about another point that widely varies between editors: placement of the {{wikipedia}} link? In most instances it is placed under the POS header, but if Ncik's boxy templates are in place, there's overlap between them. BTW, I didn't even know of {{-}}'s existence. Nice thingy. —Vildricianus | t | 19:15, 7 May 2006 (UTC)
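
For reference, a minimal sketch of what rule 19 could look like if applied mechanically, assuming the CSS change makes the exception before ---- unnecessary. The function name collapse_blank_lines is made up for illustration and is not part of any existing script discussed here.

```python
import re


def collapse_blank_lines(wikitext: str) -> str:
    """Collapse every run of two or more blank lines into one blank line."""
    # Normalise Windows line endings first so "\r\n" pairs are not miscounted.
    text = wikitext.replace("\r\n", "\n")
    # A "blank" line may still hold spaces or tabs; treat those as blank too.
    # The pattern needs at least three newlines, i.e. at least two blank lines.
    return re.sub(r"\n[ \t]*\n(?:[ \t]*\n)+", "\n\n", text)


if __name__ == "__main__":
    sample = "==English==\n\n\n===Noun===\n{{infl}}\n"
    print(collapse_blank_lines(sample))
```

A single blank line is left alone, so the layout cases EncycloPetey mentions would still need {{-}} or a moved floating box rather than extra blank lines.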

20. controversial: Categories to bottom of entire page, not language section

I have no opinion on this.—Stephen 09:16, 2 May 2006 (UTC)
A related issue is categories between headings where a blank line would otherwise go. I've been leaving these untouched so far. I tend to agree that this one is controversial and that takes it beyond the scope of normalization for now. It would need discussion and perhaps voting on the Beer parlour first. — Hippietrail 22:39, 2 May 2006 (UTC)
See WT:BP#Placing of categories. —Vildricianus 17:47, 3 May 2006 (UTC)
So this should be removed from "The List" and added to "deferred topics"? --Connel MacKenzie T C 17:02, 6 May 2006 (UTC)
I disagree - it is often useful to put categories alongside what they apply to. For example, I routinely put [[Category:English heteronyms]] in the pronunciation section, which makes it clear that this has been considered. Putting it at the bottom can make it look like it has been overlooked.
Another reason for putting categories in the place where they apply is if, for any reason, that content is deleted (perhaps because the content is wrong), it is clear that the category should be deleted as well. If the category is at the bottom, it is very easy to overlook it.
Furthermore, as categories do not add any formatting to the final page (except for being listed at the end), there is no harm in putting categories close to the content they apply to. (Or maybe they do add whitespace - see #26 below.)
Of course, categories embedded in templates cannot be moved. — Paul G 07:21, 7 May 2006 (UTC)
So, you are saying this should be discussed in the "controversial issues" section instead then, right?
Note that many entries already do have all categories moved to the bottom. If you wish to remove a section now, you currently have to check both places; where the category might be relevant, and at the bottom of the entry. And perhaps the end of that language section as well. If we decide at some point that categories belong where they are most relevant, who will go back and change the last 71,000 English entries, to comply? --Connel MacKenzie T C 08:33, 7 May 2006 (UTC)
Disagree for non-English. Users editing a single language need to be able to find all categories listed with that language's section. Otherwise, what's the point of being able to edit long pages by section? I don't want to have to hunt for categories that might be at the bottom of the page or might be carried in an included template or might have been dropped somewhere else. I favor placing all categories for non-English languages immediately after the language header. For English, the bottom of the page makes the most sense. --EncycloPetey 14:23, 7 May 2006 (UTC)
Again: you have to "hunt" for them currently anyway. --Connel MacKenzie T C 17:55, 7 May 2006 (UTC)

21. All HTML auto-converted to UTF-8. &amp; --> &

I agree.—Stephen 09:16, 2 May 2006 (UTC)
I agree. — Paul G 07:21, 7 May 2006 (UTC)
I agree. --EncycloPetey 14:31, 7 May 2006 (UTC)
Personally I'd like to agree, because I'm using a fully UTF-8 capable browser now, but that's a pretty recent thing for me. For people who aren't able to (or don't know how to) edit Unicode, HTML entities are preferable. Are we ready to lock such people out?
(Side notes on what I just said: (1) Yes, since most of our entries do use UTF-8 unapologetically, someone who can't handle it is in big trouble already. But I worry -- even though it's a pretty small possibility -- about someone who makes a new entry using HTML character entities, comes back tomorrow to work on it some more, and can't, because the entities have been autoconverted to UTF-8 behind his back. (2) Even for the majority of editors who do have UTF-8 capable browsers, I'm not sure they're all suitably aware of all of the new distinctions which can and sometimes must be made. If you're used to working just in ASCII, or just in 8859-1/Latin1, you've got some surprises in store when you start using Unicode. There are three kinds of double quotes " “ ” and four kinds of apostrophes/single quotes ' ʼ ‘ ’. This ASCII hyphen - is not the same as this en dash – which is not the same as this minus sign − which is not the same as this em dash —. This turned letter ǝ is not the same as this schwa ə (a distinction which I only learned of after observing the difficulties Hippietrail and Paul G had with the rhymes:English:-eɪʃǝn page). And so on. And if you're not aware of these and several other equally subtle distinctions, you can goober things up, as witness the redirect of aren’t to aren't (or, for that matter, from it's to it’s -- everybody caught the inconsistency there, right?). And even if you are aware of the distinctions and your browser does support UTF-8, you may not know how to enter all those strange characters that aren't on your keyboard...) –scs 15:12, 5 June 2006 (UTC)
Well, we already do, by policy. The choice could have gone either way - towards HTML coded for editing or towards UTF-8 for editing. It went towards UTF-8. The recommendation to avoid HTML coding at all costs, is so that Searching works. --Connel MacKenzie T C 15:26, 5 June 2006 (UTC)
Are you talking about everywhere, or just in entry names? —scs 16:34, 5 June 2006 (UTC)

22. All HTML tables converted to wikitables

I don't really understand what this means.—Stephen 09:16, 2 May 2006 (UTC)
In the main namespace especially, we should never see <table>, but instead the {| syntax for specifying tables. --Connel MacKenzie T C 17:04, 6 May 2006 (UTC)
I agree. — Paul G 07:21, 7 May 2006 (UTC)
I agree, but we need a good and easy to find resource available explaining the ins-and-outs of tables, including how all our own templates are set up. Most of what I've needed, I've had to pull from existing examples I've come across here and on Wikipedia. --EncycloPetey 14:31, 7 May 2006 (UTC)
Try m:Help:Table and m:Help:Template. —Vildricianus | t | 14:59, 7 May 2006 (UTC)

I've never seen such HTML tables, only in templates. Perhaps it's a better thing to say that in the main namespace, all wikitables should be hidden in templates, notably the top/mid/bottom ones, and preferably, also inflection templates. —Vildricianus | t | 09:57, 7 May 2006 (UTC)
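
For anyone unsure what rule 22 means in practice, the sketch below turns a very plain HTML table into the {| syntax using Python's standard html.parser. It is an illustration under heavy assumptions: attributes, captions, nested tables and any markup inside cells are ignored, and the names TableToWiki and html_table_to_wikitable are invented here.

```python
from html.parser import HTMLParser


class TableToWiki(HTMLParser):
    """Collect wikitable lines from a flat <table>/<tr>/<th>/<td> structure."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self.cell = None  # (marker, [text pieces]) while inside a cell

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.lines.append("{|")
        elif tag == "tr":
            self.lines.append("|-")
        elif tag in ("td", "th"):
            # Header cells start with "!", ordinary cells with "|".
            self.cell = ("!" if tag == "th" else "|", [])

    def handle_data(self, data):
        if self.cell is not None:
            self.cell[1].append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self.cell is not None:
            marker, parts = self.cell
            self.lines.append(f"{marker} {''.join(parts).strip()}")
            self.cell = None
        elif tag == "table":
            self.lines.append("|}")


def html_table_to_wikitable(html_text: str) -> str:
    parser = TableToWiki()
    parser.feed(html_text)
    parser.close()
    return "\n".join(parser.lines)


if __name__ == "__main__":
    print(html_table_to_wikitable(
        "<table><tr><th>Case</th><th>Form</th></tr>"
        "<tr><td>nominative</td><td>rosa</td></tr></table>"
    ))
```

Real entries would more likely be handled by moving the table into a template, as suggested above; m:Help:Table remains the place to check the syntax.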

23. All POS (or equivalent sections) must have at least one line starting with "#" before the next heading

I agree.—Stephen 09:16, 2 May 2006 (UTC)
I have a CSS / Javascript idea for those who don't want to see sense numbers for single-sense articles that should overcome any difficulty here. — Hippietrail 22:49, 2 May 2006 (UTC)
I agree. There should not be intrusive headers between any POS header and the content for that header. --EncycloPetey 14:31, 7 May 2006 (UTC)
Confer entry at I love you. --Connel MacKenzie T C 17:04, 6 May 2006 (UTC)
I agree, subject to what Connel says - a few pages exist purely for the purpose of providing translations (see also day after tomorrow). Perhaps #23 can be reworded to take this into consideration. — Paul G 07:21, 7 May 2006 (UTC)
  • I think day after tomorrow deserves a definition. Agreed, it's a bit weird for English, but taking into account that in other languages it's often a single word, I'd say we can just as well define the concept. —Vildricianus | t | 10:02, 7 May 2006 (UTC)
I don't see how this wording doesn't already take that into account. I think my pet peeve I was trying to address when I wrote this was this style:
===Noun===
{{en-noun-reg}}

====Alternative spellings====
*[[alt1]], [[alt2]]

# Noun definition #1.
# Second noun definition.
# Noun definition #3.
In that, I think the intervening "alt spells" is absolutely unacceptable, for several reasons.
  1. I have yet to see where an alternate spelling applies to only a single part of speech.
    (Oh, I bet there is one! I think I had an example once, but I forget. —scs 15:18, 5 June 2006 (UTC))
  2. There already is a less intrusive place for alternate spellings (before etymology.)
  3. Interleaving them makes parsing harder, as any program must assume the definitions "belong" to the alternate spellings heading, instead of the noun heading.
  4. Even in the case where it is possible for alternate spellings to exist for a single part of speech, the sub-heading (as all other sub headings) needs to follow the definitions, not precede them.
--Connel MacKenzie T C 08:28, 7 May 2006 (UTC)
Absolutely agree (despite the I love you and day after tomorrow examples.) —scs 15:18, 5 June 2006 (UTC)
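
Point 3 of the list above (parsing) can be illustrated with a rough checker for rule 23: it reports every POS heading that reaches the next heading without a "#" line. The set POS_HEADINGS is a deliberately tiny, assumed subset, and the function name is made up; translation-only pages such as those mentioned above would need to be whitelisted separately.

```python
import re

# Any wiki heading: equal numbers of "=" on both sides of a non-empty title.
HEADING = re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$")
POS_HEADINGS = {"Noun", "Verb", "Adjective", "Adverb"}  # illustrative subset only


def pos_sections_missing_definitions(wikitext: str) -> list:
    """Return POS headings with no '#' line before the next heading."""
    missing = []
    current = None          # POS heading currently being scanned, if any
    has_definition = False
    for line in wikitext.splitlines():
        match = HEADING.match(line)
        if match:
            if current is not None and not has_definition:
                missing.append(current)
            title = match.group(2)
            current = title if title in POS_HEADINGS else None
            has_definition = False
        elif current is not None and line.startswith("#"):
            has_definition = True
    if current is not None and not has_definition:
        missing.append(current)
    return missing


if __name__ == "__main__":
    bad = (
        "===Noun===\n{{en-noun-reg}}\n\n"
        "====Alternative spellings====\n*[[alt1]], [[alt2]]\n\n"
        "# Noun definition #1.\n"
    )
    print(pos_sections_missing_definitions(bad))
```

Run on the disputed layout, the definitions are attributed to the "Alternative spellings" heading and the Noun section is reported as empty, which is exactly the parsing problem described.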

24. POS headings (ongoing individual consolidation discussions - sometimes surprisingly controversial.)

I don't understand what this means.—Stephen 09:16, 2 May 2006 (UTC)
Reducing the number of third level headings in en.wikt:, by combining things like "===Pluralish noun===" or "===Noun phrase===" into 'standard headings' like "===Noun===". --Connel MacKenzie T C 13:38, 2 May 2006 (UTC)
I am in favour of this but again I feel it's too controversial for this discussion. I would love to see it fully fought out and decided elsewhere though. — Hippietrail 22:51, 2 May 2006 (UTC)
Broadly in favour, although trimming everything to "Noun", "Verb", etc, may be too extreme. For example, the OED uses "vbl. sb." (verbal substantive [noun]) and "ppl. a." ("participial adjective") for words like "freezing" in "it's freezing cold" and "this food is suitable for freezing" to show that these derive from the verb "to freeze". I think it is useful to retain parts of speech like "verbal noun" and "participial adjective". On the other hand, "noun phrase" (which should probably be "substantive phrase" anyway) can be trimmed to plain "noun" without loss of useful information - a noun is a noun, and you can tell it is a phrase by seeing that it contains more than one word. — Paul G 07:21, 7 May 2006 (UTC)
I think this should be deferred to the "too controversial" list, for now. All print dictionaries that I know of list parts of speech as abbreviations. Spelling out the meaning of the abbreviation is very confusing to the average reader (who just wants a frikkin definition!) My pet peeve is Transitive vs. Intransitive being split into separate headings, instead of a single Verb heading. That however, is spectacularly controversial. Because of the embryonic state of Wiktionary today, the focus most often leans towards assisting entry of translations. As Wiktionary matures, the focus will shift towards making entries readable especially to the average reader. If we want to retain this, on this list, we'll have to specifically exclude my controversial Tr/Intr issue, I think.
I find that "Noun phrase" is utterly useless; I completely agree with Paul that "phrase" should be omitted in that case. OTOH, the "Phrase" heading alone/itself is particularly useful for phrasebook entries. The "Idiom" heading is astoundingly useful; even when a particular form of an idiom can be pigeon-holed into a particular part of speech, other similar forms (that redirect to it) often are not the same part of speech/function! I feel strongly that idioms should be listed under the "Idiom" heading, while any part of speech should be listed at the start of a definition line. That, of course, mimics what print dictionaries do. --Connel MacKenzie T C 08:12, 7 May 2006 (UTC)

Agree with HT. Too controversial for this topic, and it has recently been "discussed" again in the BP. Even though personally, I'd love to see, at last, an exhaustive and final list of POS headings compiled, each with considerable archived arguments for future subdual of contestation. But that'll be for the next round. —Vildricianus | t | 10:12, 7 May 2006 (UTC)

This deserves its own discussion, but it is worth pursuing to see what POS headers are desired, needed, and used. I thought I had a good handle on these until I started creating Latin entries. --EncycloPetey 14:31, 7 May 2006 (UTC)
I strongly agree this must go to the "controversial" list. Part of the actual resolution will have to be devising a list of headers (confer: /todo2) for discussion. --Connel MacKenzie T C 18:02, 7 May 2006 (UTC)
Even though there are aspects which are controversial, I think we could and should do some consolidation now, as part of this normalization. If you look at Patrik Stridvall's header list at http://tools.wikimedia.de/~stridvall/headers.php under English, you see wild variation. We may not have decided (and may never decide) whether to have "Transitive verb" and "Intransitive verb" or just "Verb", but let's get rid of "Transitive Verb" and "Verb, transitive" and all the other inconsequential variations.
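
A sketch of the uncontroversial part of that consolidation: mapping capitalization and word-order variants onto one spelling without merging "Transitive verb" into "Verb". The CANONICAL table is an assumed, hand-written example rather than an agreed list, and normalize_heading is a hypothetical name.

```python
import re

# Variant (lower-cased, single-spaced) -> preferred spelling. Illustrative only.
CANONICAL = {
    "transitive verb": "Transitive verb",
    "verb, transitive": "Transitive verb",
    "intransitive verb": "Intransitive verb",
    "verb, intransitive": "Intransitive verb",
}

HEADING = re.compile(r"^(=+)[ \t]*(.+?)[ \t]*\1[ \t]*$")


def normalize_heading(line: str) -> str:
    """Rewrite a heading line to its canonical spelling, if one is listed."""
    match = HEADING.match(line)
    if not match:
        return line  # not a heading at all
    marks, title = match.group(1), match.group(2)
    key = " ".join(title.lower().split())
    canonical = CANONICAL.get(key)
    # Headings not in the table are returned exactly as they were typed.
    return f"{marks}{canonical}{marks}" if canonical else line


if __name__ == "__main__":
    for heading in ("===Transitive Verb===", "===Verb, transitive===", "===Noun==="):
        print(normalize_heading(heading))
```

Because unknown headings pass through untouched, the contested choices stay a matter for discussion and are never applied silently.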

25. Gender templates

  • Replace all f, m, c, n and pl with {{f}}, {{m}}, {{c}}, {{n}} and {{p}} respectively. --Connel MacKenzie T C 06:17, 6 May 2006 (UTC)
Agreed - although, to be clear, I think Connel means typing {{m}}, not {{temp|m}}; is that right?
Correct. I meant {{m}}. --Connel MacKenzie T C 08:01, 7 May 2006 (UTC)
Although this makes no difference to the output, it is possible that some time in the future we might (although I wouldn't like to see it) change {{m}} to expand to masculine rather than m. Having used the template for this everywhere would make this very simple and quick to change. We have already seen this happen with {{p}} which has changed from pl to plural (there is a separate discussion of this elsewhere and so it would be off-topic to discuss here whether this is right or not).
I think that conversation is welcome here. As another side note, I think Hippietrail is doing some experimental magic/css with these particular templates as well. --Connel MacKenzie T C 08:01, 7 May 2006 (UTC)
  • Yes, HT made it possible to choose whether they have a period or not: m vs. m.
  • No; {{p}} is still pl
  • Also, their output is not the same: m has text displaying on hover, which doesn't happen with m
  • On topic: can't this be done by a bot? —Vildricianus | t | 10:18, 7 May 2006 (UTC)
  • NO, IT CANNOT BE DONE BY 'BOT! I had this particular change in my monobook.js for a long time; it worked correctly about 70% of the time - but there are a gazillion cases where it did false-positive replacements that I had to undo by hand. (This was why I started considering the "edit row buttons", so that I'd be able to skip certain types of edits manually.) To fully automate this would be a dreadful mistake. --Connel MacKenzie T C 18:08, 7 May 2006 (UTC)
Why the shouting? I can read lowercase. —Vildricianus | t | 19:18, 7 May 2006 (UTC)
Disagree, because this contradicts what was said above about removing templates from translation tables. I would like to see these gender/number templates used everywhere except in the translation tables.--EncycloPetey 14:31, 7 May 2006 (UTC)
Um, translation tables happen to be the place where they were made for and where they're most used. In most other instances, the unabbreviated word should be mentioned. —Vildricianus | t | 15:02, 7 May 2006 (UTC)
EncycloPetey, unfortunately I think you misunderstood the guideline/recommendation in the section above. This change would not be limited to translation sections, nor would it be limited to non-translation sections. This is about replacing the text ''f'' with {{f}} wherever it occurs. --Connel MacKenzie T C 18:08, 7 May 2006 (UTC)
(Well, everywhere ''f'' is being used as a grammatical gender marker, that is. —scs 15:30, 5 June 2006 (UTC))
Absolutely agree. —scs 15:30, 5 June 2006 (UTC)
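
Given the warning above about false positives, any automated help for rule 25 probably has to be narrow. The sketch below only touches an italic marker that directly follows a closing link bracket, which is an assumption about where gender markers usually sit and not the logic of the monobook.js mentioned above. Everything else is left for manual review.

```python
import re

# Italic marker text -> template name ("pl" maps to {{p}} as proposed above).
GENDER_TEMPLATES = {"f": "f", "m": "m", "c": "c", "n": "n", "pl": "p"}

# Match ''f'' etc. only right after "]]" plus whitespace, and never '''bold'''.
MARKER = re.compile(r"(\]\][ \t]+)''(f|m|c|n|pl)''(?!')")


def templatize_gender_markers(line: str) -> str:
    """Replace italic gender markers that immediately follow a wiki link."""
    def to_template(match: re.Match) -> str:
        return match.group(1) + "{{" + GENDER_TEMPLATES[match.group(2)] + "}}"

    return MARKER.sub(to_template, line)


if __name__ == "__main__":
    print(templatize_gender_markers("*Dutch: [[woord]] ''n''"))
    print(templatize_gender_markers("The letter ''f'' is the sixth letter."))
```

A second marker on the same line (for example ''m'' followed by ''pl'') is deliberately left alone after the first replacement, so such lines still need a human pass.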

26. Categories on one line

Monstrously controversial: At the bottom of a page, since numerous categories add blank lines to the bottom of a page, all categories should be listed on a single line. Note that this is a distinct fork from current practices. Effecting this massive change can only be done with semi-automatic (e.g. WP:AWB) or automatic (pywikipediabot.py) assistance. --Connel MacKenzie T C 16:54, 6 May 2006 (UTC)

Hm, I'm not sure what you are saying, Connel - do you mean they should be typed on one line, or rendered by the software on one line? I was under the impression that categories do not add blank lines to a page (see #20 above). If they do, perhaps the software should be modified so that they do not. Or do you mean all on one line when they are rendered? Probably not, because that is how they appear. I don't like this proposal, in any case. — Paul G 07:21, 7 May 2006 (UTC)
I'm saying that if each category is on a separate line at the bottom of the entry, the MediaWiki software makes that line invisible/blank for the next pass of rendering. If there is one blank line, it is not problematic. If there are 20 categories, the intermediate processor will render 20 blank lines, resulting in a screen-full of nothing at the end of the page. Actually, I have not sufficiently tested this, so I'll do so now, on this page. --Connel MacKenzie T C 07:54, 7 May 2006 (UTC)
Well, I guess I was wrong. --Connel MacKenzie T C 07:58, 7 May 2006 (UTC)
Each instance of [[Category:...]] and all its blank spaces it produces is ignored by the software for page layout. —Vildricianus | t | 10:22, 7 May 2006 (UTC)

27. One space after "#"

Definitions shouldn't start immediately after the "#", but after "# " instead. --Connel MacKenzie T C 00:05, 8 May 2006 (UTC)

Agree. —Vildricianus | t | 20:02, 8 May 2006 (UTC)
I agree, but what about following lines with examples which start "#:" - should there be a space after the colon? — Hippietrail 20:06, 8 May 2006 (UTC)
That would be #28.  :-) Please, you make a recommendation as to what's best in that case - I am fairly inconsistent with them. --Connel MacKenzie T C 20:08, 8 May 2006 (UTC)
I usually put a space after #, #:, #*, *: etc, except after a single asterisk. —Vildricianus | t | 22:23, 8 May 2006 (UTC)
I agree, but also, one space after a template after a # e.g. "#{{chemistry}} definition" SemperBlotto 20:51, 8 May 2006 (UTC)
Yes, but I usually also put a space between # and {{chemistry}}. —Vildricianus | t | 22:23, 8 May 2006 (UTC)
  • Perhaps this could be better restated. "One space after line-start wikisyntax." Or: "One space between wikisyntax and visible content." The reason I am inconsistent is that "*:''" is something I confuse with "*: ''". Um, I mean without the double-quotes.
  • Argh. OK, said a different way: line start wiki-syntax characters that can have a space after them, before the content starts, should. Doing so helps newcomers understand which characters are wiki-formatting and which are content-formatting. This "guideline" could apply to stackable ("*", ":", "#") and non-stackable ("{|", "||", "|}") wiki-syntax.
  • --Connel MacKenzie T C 17:19, 9 May 2006 (UTC)
  • So is that also one space after *? If so the perfect format is imperfect. Jonathan Webley 08:17, 6 June 2006 (UTC)
    • I never do that. — Vildricianus 09:57, 6 June 2006 (UTC)
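
A sketch of rule 27 in its narrowest reading, covering definition lines only. Lines beginning "#:", "#*" or "##" are left untouched because the spacing after those is still undecided above, and "#REDIRECT" lines are skipped so redirects are not broken. The function name is hypothetical.

```python
import re

# "#" at line start, any spaces/tabs, then real content (not another list
# character and not whitespace).
DEFINITION_START = re.compile(r"^#[ \t]*(?=[^#*:;\s])")


def space_after_hash(wikitext: str) -> str:
    """Ensure exactly one space between a leading '#' and the definition."""
    fixed = []
    for line in wikitext.split("\n"):
        if line.lower().startswith("#redirect"):
            fixed.append(line)  # redirect syntax, not a definition line
        else:
            fixed.append(DEFINITION_START.sub("# ", line))
    return "\n".join(fixed)


if __name__ == "__main__":
    print(space_after_hash("#{{chemistry}} definition\n#:''example''\n#  Spaced out."))
```

The same pattern could be extended to "*" or "#:" once there is agreement on whether those take a space.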

Topics deferred

Thornier issues exist, like definition line "tags." For a while, we've waffled between (tag1), (tag2), (tag3,), (tag4, tag5): tag6: and many others. Recently, I was scolded for ending a list of tags with a colon, as there were already parentheses there. But it is also improper to have a list of parenthetical items. I think the parentheses are astronomically stupid looking now. I'd like to see that issue on the BP and voted on. Maybe parenthesis abuse is a British thing, I dunno. But we probably shouldn't have any parentheses on any definition lines, ever.

This whole topic could also be restarted for each namespace, as each seems to have its own rules: Templates (doc on page, or in talk?,) Categories (crazy pipe partial-sorting thing,) Appendix (pseudo-namespace, Wikipedia=style heading rules), Wiktionary (1/2 Help: format, half blog format) etc.

Right now, I need sleep. More thoughts tomorrow. --Connel MacKenzie T C 08:53, 30 April 2006 (UTC)

  • Let me comment on your points. Please do not change their order or remove any since I'll refer to them by number:
comments moved into corresponding sections above

Definition tags are a problem. However I've been thinking a lot about lists lately and I think we can achieve a lot by making them use simple # for editing them, and CSS to format them, probably with some kind of ((list)) and ((end)) templates to contain the CSS classes and id's. Our current systems cannot handle placing multiple tag templates within one set of parentheses and multiple parentheses are ugly. Another simpler idea is to just forgo the parentheses as many print dictionaries do. Italics alone should suffice. My pet peeves are: 1) mixing italicized and non-italicized parentheses. 2) italicized parentheses altogether. 3) use of colons together with parentheses.

The definition tags need perhaps a whole separate conversation. Whatever we end up with has to adhere to the KISS principle. Multiple parentheses are more than ugly; they are incorrect. More and more, I think we should not have parentheses at all. Your point #3 is exactly what people complained about me doing. I think multiple parentheses are far worse, but that is my POV. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
As an interim compromise, I would propose that we do recommend using templates for definition tags, always. Rendering (whether italic or not, parenthesized or not, abbreviated or not, etc.) can be handled by the template expansion and by Hippietrail's CSS magic. The only hard problem is how to deal with multiple tags (e.g. UK-specific vulgar computer slang), but I'm willing to defer that, and to live with redundant, uncollapsed parentheses today. —scs 15:36, 5 June 2006 (UTC)

On other namespaces, I think it would be too distracting to worry about them at this stage. I think more can be accomplished by concentrating on the main area first. The rest can be addressed when we see our progress. — Hippietrail 16:32, 30 April 2006 (UTC)

I agree. But I'll be thinking of them as we go along. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
Speaking of "topics deferred", do we ever want to address the formatting of pronunciations, the handling of the several different schemes, etc.? —scs 18:05, 5 June 2006 (UTC)
  • WARNING: I plan to put this mess onto subpages. Each of the twenty plus points need separate conversations, apparently. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)

  • As a fervent normalizer (12k edits that rely heavily on this business), I'd like to add some things. I understand from the above points that this is more or less the desired output:

The "perfect" format

{{see|Word}}
{{wikipedia}}
==English==

===Noun===
{{infl}}

# After a space, a definition with first-letter uppercase and full stop.

====Related terms====
*[[related]]
*[[term]]

====Translations====
'''gloss'''
{{top}}
*Dutch: [[translation]] {{m}}
{{mid}}
{{bottom}}

'''other gloss'''
{{top}}
*Dutch: [[second translation]] {{f}}
{{mid}}
{{bottom}}

===Reference===
{{R:Webster 1913}}

[[Category:Words]]

----

==Spanish==

===Noun===
'''headword'''

# [[#English|word]]

[[Category:Spanish nouns]]

[[es:interwiki]]

Additional comments

Comments

This is not entirely the way I've been doing it but I understand that it's largely what Connel's points involve. Right? Never mind the placement of categories now.
Individual comments moved to appropriate section above.
On tags: I've been thinking and looking to what print dictionaries do, and have found the SOED, which uses italics and small caps, to have a nice format. Thanks to the templates we use for them, we can change this overnight, although we'll need to instate commas where necessary. To give you an idea, it looks like this:
  1. AERONAUTICS Definitions.
  2. NAUTICAL, CHEMISTRY Two tags here.
It's of course also possible to "solve" the tag issue by adding the parentheses manually and leaving them out of the templates, like ({{biology}}, {{chemistry}}). —Vildricianus 18:27, 30 April 2006 (UTC)
Yes, definitely. — Paul G 07:21, 7 May 2006 (UTC)
Note: I added the blank line after the inflection line in your example above, but I left the "incorrect" category placement as that is so touchy. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
  • NOTE: (as I said above) I plan to move all of this to subpages here. Each numbered list item above will become a heading and I'll rearrange some of the related ones after adding yours and HT's comments into each section. --Connel MacKenzie T C 23:41, 30 April 2006 (UTC)
  • Hrumph. Maybe I'll make them sub-pages tomorrow. --Connel MacKenzie T C 06:14, 1 May 2006 (UTC)
  • Yup, fine. I'd concentrate for this round on the 20ish points, leaving discussion about tags and category placement for later (the latter is "ongoing" at BP). —Vildricianus 07:59, 1 May 2006 (UTC)
  • I'm finding it difficult to relate the comments to the points, but in general I agree with the sample layout that Vildricianus laid out above. There are just a few points that I would disagree on or don't understand:
Individual comments moved to the appropriate section above.
Oops, I forgot to comment on this first time round. I prefer to put the {{wikipedia}} tag between the POS and the inflections. Not only does this put the Wikipedia link alongside the word to which it refers, it also usually avoids adding any blank space to the rendered page. Putting it at the top or elsewhere can often do this, making the page look inelegant. — Paul G 07:27, 7 May 2006 (UTC)
My convention depends on the number of headings in an entry. If there are more than three headings, an automatic TOC will be generated. When a TOC is present, it makes sense to have the {{wikipedia}} link above the first heading so that it appears in the white-space to the right of the TOC. But if there are fewer than four headings, I do exactly as Paul says: between the POS heading and the inflection line. --Connel MacKenzie T C 07:50, 7 May 2006 (UTC)
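
For anyone who wants to apply the convention above mechanically, here is a minimal Python sketch. It is only an illustration: the names `count_headings`, `wikipedia_placement` and `TOC_THRESHOLD` are made up for this example, and the threshold simply encodes the "more than three headings" rule described above rather than anything read from MediaWiki itself.

```python
import re

# Matches a wikitext header line such as "==English==" or "===Noun===".
# The backreference \1 requires the same number of "=" on both sides.
HEADER_RE = re.compile(r"^(={2,6})[ \t]*(.+?)[ \t]*\1[ \t]*$", re.MULTILINE)

# Assumption taken from the comment above: a TOC is generated once an
# entry has more than three headings, i.e. four or more.
TOC_THRESHOLD = 4


def count_headings(wikitext: str) -> int:
    """Count the header lines (levels 2 to 6) in an entry's source."""
    return len(HEADER_RE.findall(wikitext))


def wikipedia_placement(wikitext: str) -> str:
    """Suggest where {{wikipedia}} would go under the convention above.

    Returns "top" when a TOC is expected (the box then sits in the
    white-space beside it), otherwise "after-pos-heading".
    """
    if count_headings(wikitext) >= TOC_THRESHOLD:
        return "top"
    return "after-pos-heading"


if __name__ == "__main__":
    entry = "==English==\n\n===Noun===\n'''word'''\n\n# A definition.\n"
    print(count_headings(entry), wikipedia_placement(entry))
```

The sketch only counts headings; it does not account for pages that force or suppress the TOC, so treat its answer as a hint for a human editor.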

Regarding terminology: "heading" vs. "header". I understand the confusion that exists with the overlapping of terms with regard to C/C++ ".h" files. However, we are in a Wiktionary context, not a C or C++ context. I use "heading" to refer to a rendered == == section heading and "header" to refer to the wikitext that denotes it. I'd prefer to not change my informal convention, as the distinction is perhaps helpful, sometimes. --Connel MacKenzie T C 13:46, 2 May 2006 (UTC)

Actually there are other senses I was thinking of, such as in binary file formats. Encarta does however give a sense for "header" that is pretty much a synonym for "heading". Collins doesn't, but hey, this is all just trivia. Back to the topic (-: — Hippietrail 22:55, 2 May 2006 (UTC)

I think this thread is a valuable resource regarding Wiktionary terminology (things like "inflection line", "POS headings" etc.) Does anyone feel like expanding Wiktionary:Glossary? Or shall I put it on my Todo? Having a comprehensive reference list of these is beneficial for mutual understanding on all these matters. —Vildricianus | t | 10:29, 7 May 2006 (UTC)

In practice

Please check this edit to see that it conforms to the "perfect entry" and all rules, as above. I suggest this be the format for "phrasebook" entries. Perhaps {{phrasebookdef}} for the definition line? --Connel MacKenzie T C 16:12, 6 May 2006 (UTC)

Specifically rule #23. --Connel MacKenzie T C 17:06, 6 May 2006 (UTC)

Wrap up time?

I think we are nearing the wrap up time for this, as Vild pointed out in the top-most section. I would like to thank everyone who has participated. I also request that each continue to participate.

Vild, HT or I will take this list, glean the least controversial dozen or so items and build a small list to post on the BP in the next day or two. Some rewording of HT's preamble may be in order. My initial list had some pretty bone-headed stuff that I think you've all helped clear away.

I have enjoyed (perhaps the most) Paul's linguistics lessons, even if I do still disagree. I did not expect this to be so educational, but I am glad it was.

Thank you all again. Let's see where we take this next, eh?

--Connel MacKenzie T C 18:15, 7 May 2006 (UTC)


How about wrapping up? I could make a summary of this, put it in the Beer parlour and expect a couple of people to laugh at us. Most of the above described format rules are exercised by my javascript, though, so given enough time, all entries will move towards this standard (especially the blank lines stuff, which is largely at odds with what most entries have and most contributors do). — Vildricianus 11:57, 5 June 2006 (UTC)
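
To make the "blank lines stuff" concrete, here is a small Python sketch of the kind of pass such a script might perform. It is not the javascript mentioned above. The rule it applies (collapse runs of blank lines, and put exactly one blank line before any header that follows other content) is an assumption chosen for illustration, and `normalize_blank_lines` is a hypothetical name.

```python
import re

# A wikitext header line: 2 to 6 "=" signs, some text, 2 to 6 "=" signs.
HEADER_LINE = re.compile(r"^={2,6}[^=].*={2,6}[ \t]*$")


def normalize_blank_lines(wikitext: str) -> str:
    """Collapse blank-line runs and ensure one blank line before headers.

    Assumed rule for this sketch only; the thread's actual rules were
    still under discussion.
    """
    out = []
    for raw in wikitext.splitlines():
        line = raw.rstrip()
        if line == "":
            # Keep at most one blank line, and none at the very top.
            if out and out[-1] != "":
                out.append("")
            continue
        if HEADER_LINE.match(line) and out and out[-1] != "":
            # Header directly after content: insert the separating blank.
            out.append("")
        out.append(line)
    # Drop a trailing blank line, then end the text with one newline.
    while out and out[-1] == "":
        out.pop()
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    messy = "==English==\n===Noun===\n'''word'''\n\n\n\n# A definition.\n"
    print(normalize_blank_lines(messy))
```

Running the function on its own output changes nothing, which is the property that lets a script apply it on every edit without churning pages that are already in shape.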

Several unrelated goals here

You guys may be ready to "wrap this up", but remember, some of us just got here. :-) Me, besides the comments I just sprinkled above, I've got one biggie: some of this stuff matters much more (or at least, differently) than others.

There are several quite different goals someone might have in pursuing this normalization:

(A) Make pages look consistent to readers.
(B) Make page source look consistent to editors.
(C) Enable automatic processing, today, e.g. by using templates for gender and definition tags.
(D) Enable machine parsing of wiktionary entries (for other purposes).
(E) Turn Wiktionary into the more-structured database some of us wish it were.

Of these, to me, (B) is the least important, although about half of the entries on the normalization list have to do only with this. A reader doesn't care about blank lines before or after headers, or extra spaces between the == == and the heading text. And a (reasonable) parser doesn't care, either. Obsessive editors might care (and I don't mean to be critical when I say "obsessive" here, as there are plenty of these things that I'm obsessive about, too), but to me, if an entry is complicated and hard to edit I'll add a few blank lines here and there without thinking about it too much, and if an entry is too sparse I'll delete some blank lines without thinking about it too much, and I really don't worry about "standardizing" this sort of thing. A guideline for the preferred formatting at this level would certainly be useful, but it doesn't have the same kind of urgency as the other four goals I've listed.

Of these, to me, (D) and (E) are quite important, although I know that there are lots of editors for whom they're not important at all, who are just as uninterested in machine parsing as I am in normalizing newlines. And there's nothing wrong with that, either; there are legitimately multiple goals here (only partly overlapping, or sometimes wholly distinct). (So what I'm saying here is that it doesn't bother me if someone is uninterested in machine parsing, or interested in blank lines, as long as they're not bothered by the fact that I'm not.)

But I bring this up because I think the larger audience is going to be equally struck by what they'll see as the same kind of differential importance of these issues. So I think we need to at the very least split them up, and perhaps even address them as somewhat separate subprojects. Here's my take:

issues that matter for proper functioning of existing software 
3, 4, 5
issues that are visible to readers 
7, 8, 18, 23, 24
issues that enable wiktionary-specific processing, visible to readers 
25, (definition tags templateized)
issues that are visible only to editors (cosmetic)
1, 2, 6, 10, 11, 12, 13, 14, 15, 16, 19, 27
issues that are visible only to editors (possibly more significant) 
20, 21, 22
issues that matter for machine parsing 
7, 8, 20, 23, 24, 25, (definition tags templateized)
not sure (including stricken) 
9, 17, 26

There's some overlap there; in particular, everything I've listed as "issues that matter for machine parsing" is also in some other category (usually, interestingly enough, "issues that are visible to readers").
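
The overlap can be checked mechanically. The Python sketch below copies the item numbers from the list above into sets (the short group names are mine, and the unnumbered "definition tags templateized" item is left out) and reports which other groups each machine-parsing item also falls into.

```python
# Item numbers copied from the grouping above. The group names are
# shortened labels invented for this sketch.
groups = {
    "existing software": {3, 4, 5},
    "visible to readers": {7, 8, 18, 23, 24},
    "wiktionary-specific processing": {25},
    "editors only (cosmetic)": {1, 2, 6, 10, 11, 12, 13, 14, 15, 16, 19, 27},
    "editors only (more significant)": {20, 21, 22},
    "not sure": {9, 17, 26},
}
machine_parsing = {7, 8, 20, 23, 24, 25}


def other_groups(item: int) -> list:
    """Return the names of the groups (besides machine parsing) holding item."""
    return sorted(name for name, members in groups.items() if item in members)


# Map each machine-parsing item to the other groups it appears in.
overlap = {item: other_groups(item) for item in sorted(machine_parsing)}

# Items that are both machine-parsing issues and reader-visible issues.
shared_with_readers = machine_parsing & groups["visible to readers"]

if __name__ == "__main__":
    for item, names in overlap.items():
        print(item, names)
    print(sorted(shared_with_readers))
```

Every machine-parsing item turns up in exactly one other group, and four of the six are in the reader-visible group, which is where the "usually" above comes from.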

I have included "definition tags templateized" (which was in Connel's "Topics deferred" section), because I think it's at least as interesting as some of the other issues listed here, but not as controversial as some. Call it #28?

scs 16:30, 5 June 2006 (UTC)

[P.S. What's the difference between (D) and (E)? Maybe nothing significant, but I was thinking there's a difference between wanting to leave the door open to building a structured database later by trying to keep Wiktionary entries somewhat machine parseable now, versus actually trying to build and have a structured database now. At any rate, and despite the fact that I'd really like to have something more structured, it's important to realize that there's actually a pretty huge conflict here. MediaWiki is not structured, so an attempt to impose a lot of formal structure, formal enough to allow machine processing, but to impose it merely via ad hoc editing conventions which everyone is expected to remember and follow, is a recipe for a lot of friction and disappointment. —scs]

1/ I have a gut feeling that there are more editors than readers. 2/ This list was initially intended for the hardcore normalizers like Connel, Hippietrail and a few others. Other options got stuffed in afterwards and it turned out to become a more general discussion. So (B) was really the first purpose of this thread. Having the blank lines appearing correctly is not very important, but fun for those who care and easy for a javascript to add. It also allows for quickly spotting "bad" entries, i.e. those with everything compressed, which is usually done by the less experienced here, and which are therefore possibly "bad". 3/ The other options you presented are also worked on, for instance with the inflection templates. I'm not sure though how urgent they are. Machine-parsing Wiktionary has only been done by the few (Connel, Patrik,...) who wanted to put up todo lists or stats. There's much more work on the actual content before we should consider any other fancy option. That doesn't defer normalizing for machine parsing, though. But it's not "urgent". It's work in progress just as much as the other jobs are. Compare the format of two years ago with the current. Also keep in mind that it changes constantly. 4/ I'm really bothered by the fact that you don't like the blank lines :-(. 5/ Thanks for commenting on it. I guess it's a topic that will pop up from time to time in order to get one another's confirmation of existing practice (which is what I think Hippietrail intended to do). — Vildricianus 17:47, 5 June 2006 (UTC)
Agreed that machine parsing is a longer term work-in-progress -- but note that it applies even though (nay, because) the format "changes constantly": if/when we do a big format change, the more we can automate it, the happier we'll be. (Now, as for those cotton-picken blank lines, I don't just "not like" them, I can't stand them, I fly into a blind rage when I see them, I delete them on sight, I'm really bothered that you could consider sneaking your slovenly blanklineist POV into this vital emerging policy...) —scs 18:27, 5 June 2006 (UTC)
  • I wish to say that I support any coherent, logical subdivision of this monster (that I thought at first was so short and sweet) subpage. --Connel MacKenzie T C 18:20, 5 June 2006 (UTC)