Monday, February 27, 2012

Wiktionary - Recent changes [en]: Wiktionary:Beer parlour

Wiktionary:Beer parlour
Feb 27th 2012, 12:51

: See [[Rhymes:English:-ɛri]] for how we handled one case where some dialects of English exhibit rhymes that others do not. I don't know if that's the only approach we're using for English (we're not famously consistent about these sorts of things), and maybe it's not the best approach for Catalan; but it's probably a decent starting-point. —[[User:Ruakh|Ruakh]]<sub><small><i>[[User talk:Ruakh|TALK]]</i></small></sub> 03:51, 27 February 2012 (UTC)

 


:: That approach is used for some Catalan rhyme pages as well, but the issue is that currently our Catalan rhyme pages use the schwa phoneme (in the title), which exists in Central Catalan but corresponds to two different phonemes in Valencian. This means that the words on for example [[Rhymes:Catalan:-onə]] might rhyme in Central Catalan but not in Valencian, where they would be differentiated into -ona and -one. So the question is whether there should be [[Rhymes:Catalan:-ona]] and [[Rhymes:Catalan:-one]] with a notice like the one you mentioned, even though Central Catalan doesn't have unstressed -a or -e. —[[User:CodeCat|CodeCa]][[User talk:CodeCat|t]] 12:51, 27 February 2012 (UTC)

   
 

== Proposal - complete unified login for all eligible accounts ==

 



Latest revision as of 12:51, 27 February 2012


Wiktionary discussion rooms (see also: requests)

Information desk: Newcomers' questions, minor problems, specific requests for information or assistance.
Tea room: Questions and discussions about specific words.
Etymology scriptorium: Questions and discussions about etymology, the historical development of words.
Beer parlour: General policy discussions and proposals, requests for permissions and major announcements.
Grease pit: Technical questions, requests and discussions.


Welcome, all, to the Beer Parlour! This is the place where many a historic decision has been made and where important discussions are being held daily. If you have a question about fundamental Wiktionary aspects—that is, about policies, proposals and other community-wide features—please place it at the bottom of the list (click on Start a new discussion), and it will be considered. Please keep in mind the rules of discussion: remain civil, don't make personal attacks, don't change other people's posts, and sign your comments with four tildes (~~~~), which produces your name with timestamp. Also keep in mind the purpose of this page. There are various other discussion rooms which may serve the idea behind your questions better. Please take a look to see which is most appropriate.

Sometimes discussion identifies an issue as an idea for policy development or rewriting. Such discussions may be taken out of the Beer parlour to a relevant page, or a brand new page may be created. Usually, the active policy pages will be listed in one of the sections below. See also the policy development page and the votes page.

Questions and answers will not remain on this page indefinitely, as it would very soon become too long to be editable. After a period of time with no further activity (usually a couple of weeks), information will be moved to the archives. We make a point to preserve all discussions that were started here in the archives. However, talk that is clearly not intended for this page may be moved and will not end up in the archives. Enjoy the Beer parlour!

Beer parlour archives: 2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, all subject headings


== Some questions... ==

Okay, so I've been away from Wiktionary for a while... Recently, though, I've started creating more categories and doing other little things. What has irked me slightly is looking through my deleted contributions list. So, broadly speaking, this is my question: why the frigging hell are there so many deleted categories?! O.o To be sure everyone's clear on what I do and don't get, here are a few points to elaborate:

  1. Yeah, yeah... I get that the "mainspace-isation" of appendices idea got lynched, so I know why all the anime categories and shit are gone.
  2. Okay, perhaps some topical categories I created shouldn't exist, so in such cases I understand that too.
  3. This is something I just don't get, tbh... some categories I made were deleted simply for being empty. Some of these were most likely created in a chain process. That is to say, I created a topical category listed on Special:WantedCategories, and that in turn created a redlink to another category (which, in some cases, due to lack of existing material on the subject in the language in question, may have been empty).

50 Xylophone Players talk 01:15, 2 December 2011 (UTC)

A lot of categories were renamed. See Wiktionary:Votes/2011-04/Derivations categories and Wiktionary:Votes/2011-04/Lexical categories. --Yair rand 01:38, 2 December 2011 (UTC)
Xylo, shouldn't that special page you speak of be broken down by language? Purplebackpack89 (Notes Taken) (Locker) 01:41, 2 December 2011 (UTC)
To add to Yair rand's comment, in case you don't already know — categories can't be moved, only deleted. Really, the deletion summary should point to the new category, but (somewhat understandably) not everyone bothers to do that, and (unfortunately) it's basically impossible to fix after-the-fact. —RuakhTALK 02:51, 2 December 2011 (UTC)
I've on occasion deleted something, realized I should have included a good summary, so created/restored it and deleted it again. I assume the same can be done for a category.​—msh210 (talk) 19:01, 2 December 2011 (UTC)
Okay, thanks for the replies.
@Purplebackpack89, I don't mean to sound rude but how should I know? Maybe that would be a good idea yea, but those Special pages are automatically generated, not created by contributors.
@Ruakh & Yair, here's an example of something deleted simply for being empty: Category:ru:Marxism. Just to be safe, I added a preexisting entry to it after recreating it. :P 50 Xylophone Players talk 12:31, 2 December 2011 (UTC)
Xylo, the reason I'm asking is the first thing that came to mind is that the list is highly disorganized to the point of non-navigability. If they're automatically generated, maybe get better bots? I'm not faulting any one editor for the disorganization, merely pointing out that the problem would be easier to remedy if the category was organized Purplebackpack89 (Notes Taken) (Locker) 19:36, 2 December 2011 (UTC)
It's not a category, but a special page. It's not created by a bot, but built into the software. If you want to create a project page that you believe to serve the same purpose, but better, then please be my guest. —RuakhTALK 20:24, 2 December 2011 (UTC)

== Letter entries ==

Many of our letter entries are really useless.

Take for example Ế. It is defined as "The letter E with a circumflex and an acute accent." That is not a definition, it is an etymology. It's comparable to defining dictionary as "A word composed of ten letters." It doesn't convey any sort of meaning. Therefore, I propose we delete the whole bunch of letter entries which contain no content beyond that. -- Liliana 14:14, 4 December 2011 (UTC)

My problem is with letters which aren't really used in any language. Ungoliant MMDCCLXIV 15:09, 4 December 2011 (UTC)
Most dictionaries do have letter entries summarising the origins and pronunciation of the letter (e.g. Chambers for E has "the fifth letter in the modern English alphabet, with various sounds, as in me, get, England, her, prey, and often mute, commonly indicating a preceding long vowel or diphthong"). Letters do not really have a "meaning". Equinox 15:14, 4 December 2011 (UTC)
[after e/c] I think it's more comparable to defining dictionaries as "Plural form of dictionary." The main defect, in my opinion, is the non-use of {{non-gloss definition}}: it doesn't *mean* "the letter E with a circumflex and an acute accent", it *is* the letter E with a circumflex and an acute accent. —RuakhTALK 15:15, 4 December 2011 (UTC)
In case you haven't read the main page Liliana, I think this paragraph is of at least some relevance:
"Designed as the lexical companion to Wikipedia, the encyclopaedia project, Wiktionary has grown beyond a standard dictionary and now includes a thesaurus, a rhyme guide, phrase books, language statistics and extensive appendices. We aim to include not only the definition of a word, but also enough information to really understand it. Thus etymologies, pronunciations, sample quotations, synonyms, antonyms and translations are included."
(emphasis added) 50 Xylophone Players talk 16:53, 4 December 2011 (UTC)
Keep 'em, I've actually used these pages to identify weird letters I've found in entries by copying and pasting them into the search bar. Something like "The letter E with a circumflex and an acute accent." would provide useful information to me. Mglovesfun (talk) 18:15, 4 December 2011 (UTC)
I agree, although I admit I hadn't thought of using that approach before. There are some characters I can read on some browsers, but cannot read on others. IPA characters often don't display in certain versions of browsers, but if I paste one into a Wiktionary search tool, the entry for that symbol would pop up. The same is true for weird central and south Asian scripts that almost never display for me. --EncycloPetey 04:18, 9 December 2011 (UTC)
@Palkia: You must not have been comparing our English definitions with those contained in the better online English dictionaries. The quality of our definitions is often quite poor, often no better than Webster 1913 and often worse. We have expanded our coverage in peripheral areas without addressing the main goals that users usually have for their dictionaries. In English, for synonyms and antonyms I would recommend Dictionary.com; for definitions MWOnline; for quotations Wordnik (or Google); for statistics COCA, COHA, and BNC; for etymologies Online Etymology Dictionary, for pronunciations almost any online dictionary. We seem to have the edge in translations. DCDuring TALK 21:04, 4 December 2011 (UTC)

== Possessive Pronouns vs Possessive Adjectives ==

This has probably been mentioned several times, but I think Wiktionary really needs to resolve and decide on the issue of possessive pronouns vs. possessive adjectives, just for the sake of consistency if for nothing else. The words in question include 'my', 'your', 'his', 'her', 'its', 'our', and 'their'. As of now, most of them are listed as "possessive pronouns", but 'my', 'her', 'its', and 'our' are listed as "possessive adjectives", and there is some disagreement about what they should be called, as a few people have insisted that they are traditionally considered pronouns. Most dictionaries I have seen, on the contrary, consider these words possessive adjectives rather than pronouns, including Merriam-Webster, but a few do consider them pronouns. I'm on the side of the adjectives, or at least possessive determiner, because I don't see them standing in for nouns, as opposed to 'mine', 'yours', 'his', 'hers', 'ours', 'theirs', which are only pronouns.

The other reason it's important to resolve this issue is that the translations for other languages have discrepancies with their English entry equivalents as well. For example, French 'leur' and Italian 'loro' are considered to have a possessive adjective meaning, "their" (in addition to a separate function as a possessive pronoun, "theirs", but the difference is made clear). The authoritative dictionaries of these languages also list these words as having a possessive adjective use. If one clicks on a translation link to one of these from an English entry, where the word is only a pronoun, they'll end up at a Spanish page where it is an adjective and it might confuse them. Either way, some articles are going to have to be changed for these words to match up correctly.

(the reason I don't want to just edit these myself without consulting here is because I don't want people to get mad about changing a page for a common or important word)
Word dewd544 18:22, 4 December 2011 (UTC)

For English, good reasons can be given for any of "possessive adjective", "possessive determiner", or "possessive pronoun", and though my preference is ===Determiner===, I'd be fine with ===Adjective=== or ===Pronoun===. What I would not be fine with is forcibly transposing this classification into other languages. Whether their is an adjective or a determiner or a pronoun has little bearing on whether leur or loro is. —RuakhTALK 18:46, 4 December 2011 (UTC)
Historically the first- and second-person forms are adjectives or determiners; they inflected as such in Old and Middle English. The third-person forms were historically the genitive form of the pronoun, but they later came to be inflected as adjectives or determiners as well by analogy (the situation in the Romance languages is similar, but only 'loro' and its varieties is a genitive, from Latin illorum). I prefer determiner to adjective myself, although I'm not quite sure how they can be distinguished. —CodeCat 21:41, 4 December 2011 (UTC)
Keep in mind that this will not necessarily be the same for every language. For English, I agree with the comments above that Determiner is probably the best header, although I'd like to see a (possessive) context and a cross-categorization of the terms in a sub-category of pronouns for people searching that way. However, they inflect and act more like adjectives in Latin and in Romance languages like Spanish. So, in those languages they might have a different header. Words that translate each other between languages do not always share the same part of speech grammatically in the two languages. As an extreme example: the English word year is a noun, but its Navajo translation is a verb. There isn't a noun for that term in Navajo. Any decision about how a set of words is to be classified must necessarily be a language-specific decision. It does not apply automatically to other languages because each language will have differences in grammar from other languages. This includes differences in the parts of speech. --EncycloPetey 21:47, 4 December 2011 (UTC)
From what I've seen, the distinction between adjectives and determiners is one of semantics and syntax, not about inflection. So we can't use the type of inflection to determine whether something is an adjective or not. The fact that in Dutch and several Romance languages the possessives can be preceded by a definite article may be significant, though. —CodeCat 22:47, 4 December 2011 (UTC)
Regarding the issue of translation between languages: fair enough, we don't need to worry about that, I guess. But as for consistency within English, we can agree to allow them to be called just determiners, with a 'possessive' context added? I'm not overly concerned; it's just that I hope people won't take the dictionary less seriously by seeing half of these listed as pronouns and others as adjectives when they're the same part of speech. -Word dewd543 20:05, 8 December 2011 (UTC)
I'd support that approach for English. --EncycloPetey 04:14, 9 December 2011 (UTC)

The confusion here can partly be attributed to the loose use of the term determiner, which is widely used both as a lexical category (or POS) term (equivalent to noun, adjective, preposition, etc.) and as a functional term (equivalent to subject, object, complement, modifier, etc.). Our headings should be based on lexical category, not function. The lexical category of the items in question is pronoun. They are the dependent genitive case. As I is to Brett, my is to Brett's. This group of pronouns functions as determiners in NPs, just as genitive NPs do. That doesn't mean either of them belongs to the same lexical category as words like the, each, many, and all.--Brett 13:04, 12 December 2011 (UTC)

== Inclusion of Unish as a language. ==

Recently I blocked a user for adding an entry with an unrecognized language, Unish. There is no reference to it in Wiktionary and a brief look on the Net afforded me nothing. One of the authors of this presumably new constructed language contacted me on my talkpage regarding this. I am not familiar with the criteria required to include a new language in Wiktionary, so I am opening a discussion here. JamesjiaoTC 02:58, 5 December 2011 (UTC)

I'm just going to preemptively vote Oppose -- Liliana 03:01, 5 December 2011 (UTC)
I'll listen to arguments, but would like to see an ISO 639-3 code, and I'd like to see ISBNs of books written in (and preferably not on) Unish.--Prosfilaes 05:00, 5 December 2011 (UTC)
First, an ISO language code is a prerequisite. Second, it would still have to be approved, which is very unlikely for any minor or new constructed language. See Category:Appendix-only constructed languages and Wiktionary:CFI#Constructed languages. —Stephen (Talk) 16:07, 5 December 2011 (UTC)
Correction, we do have languages without ISO 639 codes, such as {{roa-jer}} (Jèrriais). Mglovesfun (talk) 16:53, 5 December 2011 (UTC)
We don't have constructed languages without ISO codes, though. -- Liliana 17:02, 5 December 2011 (UTC)
My point was a mere correction, I didn't mean anything by it. Mglovesfun (talk) 17:06, 5 December 2011 (UTC)
It'd have to be approved, but asking for ISBNs of books written in (and not on) Unish is a pretty high standard, one that several of the currently approved languages might not meet (and I might argue for removing them for that). It's not perfect, but it's a good sign this is a real language, and not just a project.--Prosfilaes 13:09, 6 December 2011 (UTC)
I'm curious—which languages are you referring to? -- Liliana 13:17, 6 December 2011 (UTC)
Esperanto and Interlingua are on solid ground; I believe Ido and Volapük can also meet that qualification. (Not necessarily ISBNs for the older languages, but seriously printed material.) I don't believe that Occidental or Novial can. And I'm pretty sure that the only published material in Lojban is The Complete Lojban Language.--Prosfilaes 13:58, 6 December 2011 (UTC)
I have to agree on Lojban, it seems pretty useless for our purposes and I have no idea who would use that language seriously. Occidental might have historic publications (haven't checked), Novial... dunno, likely not. Incidentally, these were only approved because we have Wikipedias in these languages, so yeah. -- Liliana 15:05, 6 December 2011 (UTC)

Le lojbo karni has a record at the Library of Congress. —AugPi (t) 14:56, 16 December 2011 (UTC)
That's interesting evidence, thanks a lot. -- Liliana 13:52, 26 December 2011 (UTC)
I've also always wanted to know where Lojban is used. Mglovesfun (talk) 15:18, 6 December 2011 (UTC)
To be honest I have never seen that language outside the lojban.org website. It really seems to be just someone's pet project. -- Liliana 15:22, 6 December 2011 (UTC)
There are 2603 sentences in Lojban in tatoeba.org, which makes Lojban #35 out of 94 languages in tatoeba.org. Since Lojban is based on predicate logic, it could well be interesting to logicians. Ward Cunningham brings up (in his wiki) the question of whether Lojban could be useful for Artificial Intelligence. Duncan College at Rice University has offered a course, COLL 109, on Introductory Lojban, worth 1 college credit. Lojban has a formal grammar which enables one to (1) unequivocally determine whether a sentence is Lojbanic or not, and if the sentence is Lojbanic, then (2) it can only be parsed, unambiguously, in exactly one way. The formal grammar is "implemented", e.g., through a parser/translator available online under the name "jboski". Those two points make Lojban rather unique, even among artificial languages, and might offset the fact that it does not have much literature (as Ward's Wiki says, the language is at an "embryonic" stage (and whether it could "take off" is anybody's guess: IDK)). Anyway, since Lojban has its own Wiktionary, which could have articles linking to en.wikt, I think it would be in the interest of "diplomatic reciprocity" (?) for en.wikt to allow Lojban articles in its mainspace. —AugPi (t) 17:59, 12 December 2011 (UTC)
Looks like this is not getting approved, which is fair enough. The individual who wrote to me obviously isn't motivated enough to engage in this discussion either. Just some further info on this language: their homepage is [1]; the person I blocked has now been unblocked on the condition that he/she does not make further entries in this language without engaging in this discussion first (User_talk:K11312). I have tried emailing this user, but it says that there is no valid email address associated with the account. I suggest giving them another week to respond before closing this discussion off. JamesjiaoTC 20:58, 5 December 2011 (UTC)
No response from OP... So this discussion is now closed. JamesjiaoTC 21:14, 11 December 2011 (UTC)

== Chinese topic cat. question ==

Okay...so why exactly are categories of format "Category:cmn:Topic X in simp./trad. script" being put into Category:Categories needing attention? 50 Xylophone Players talk 20:56, 5 December 2011 (UTC)

This is related to this topic, but might not be on this topic. Which category format are we supposed to use right now? This [[Category:cmn:Plants]] (without specifying the script) or this [[Category:cmn:Musical_instruments_in_simplified_script]] (specifying the script)? JamesjiaoTC 21:06, 5 December 2011 (UTC)
I would prefer the script specific ones seeing as the functionality has been added (or was it already there?) to {{topic cat}} to have a script parameter. Furthermore, they are certainly being used; someone (dunno who since I haven't been checking histories) has transferred/added lots of entries to the script-specific categories. Just look at this one that I created earlier: Category:cmn:Musical instruments in traditional script. It is a good idea IMO, as of course, a user may only be interested in trad. or simp. Mandarin. 50 Xylophone Players talk 00:02, 6 December 2011 (UTC)
"has been added" - I doubt it works properly just yet. -- Liliana 01:17, 6 December 2011 (UTC)
I think this (which category format to use for cmn) needs to be sorted first before looking at why "Category:cmn:Topic X in simp./trad. script"'s in "Categories needing attention". I am thinking that the sc parameter of subject qualifier templates such as {{physics}} should somehow produce a category for the corresponding script for cmn instead of just cmn:Physics, that is, if people here prefer to have separate categories for the two scripts. In summary, sc=Hans should produce "Category:cmn:Physics in simplified script", sc=Hant should produce "Category:cmn:Physics in traditional script" and sc=Hani should produce both!. Just a thought. With this sorted, we can then look at pulling those categories out from "Categories needing attention". JamesjiaoTC 01:32, 6 December 2011 (UTC)
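The sc mapping proposed above can be sketched in a few lines. This is only an illustration in Python, not the actual template code: the function name script_categories and the SCRIPT_LABELS table are made up for the example, and falling back to the plain category when no script is given is an assumption.

```python
# Hypothetical helper; it does not reproduce the real {{topic cat}} or
# {{physics}} template logic, only the mapping described in the comment above.
SCRIPT_LABELS = {
    "Hans": ["simplified"],
    "Hant": ["traditional"],
    "Hani": ["simplified", "traditional"],  # unspecified Han script: both
}


def script_categories(lang, topic, sc=None):
    """Return the category names an entry would be placed in."""
    base = f"Category:{lang}:{topic}"
    labels = SCRIPT_LABELS.get(sc)
    if labels is None:
        # Assumption: with no (or an unknown) script code, keep the plain category.
        return [base]
    return [f"{base} in {label} script" for label in labels]


print(script_categories("cmn", "Physics", "Hani"))
# ['Category:cmn:Physics in simplified script', 'Category:cmn:Physics in traditional script']
```

Keeping the mapping in one table means a change to the naming scheme would only need to be made in one place.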
I had a go at adding the function for Hans/Hant, but it didn't seem to work. Not sure why I didn't post it here straight away. Mglovesfun (talk) 10:09, 6 December 2011 (UTC)
I don't feel as though there is much to "sort" to be honest, James. Using the script-split categories seems just fine and as I said, they are being used; Special:WantedCategories is already flooded with them. Dunno if I'll be able to manage it but I think I'll just go and look at the template myself and see if I can sort it out. Oh, and if I'm understanding what you're saying about the sc parameter correctly then as far as I can see it IS currently working so that sc=Hans should be used for simp. categories and sc=Hant should be used for trad. categories. 50 Xylophone Players talk 17:04, 6 December 2011 (UTC)
Dammit... could someone take a look at {{topic cat}} and see if they can figure out a solution for me? This is what I did, which works fine but... apparently also seemed to uncategorise topical categories that were there for other reasons. For example, one that I fixed had the wrong topic specified in {{topic cat}} on the page but after I changed said template the category was no longer in the attention category even before I fixed it. 50 Xylophone Players talk 18:23, 6 December 2011 (UTC)
The naming of the categories is handled using a separate template, {{topic cat name}}. So if the naming scheme is changed, changes should be made there. The templates {{topic cat parents}} and {{topic cat description}} should also be changed. —CodeCat 19:36, 6 December 2011 (UTC)
I've made the changes now, but I'm not sure how the categories should be categorised. Should Category:cmn:Musical instruments in traditional script appear in Category:cmn:Music, in Category:cmn:Music in traditional script, or in Category:cmn:Musical instruments? Or more than one of those? Right now it appears only in the first. And should it appear also in Category:Musical instruments? —CodeCat 20:35, 6 December 2011 (UTC)
First, to directly answer your question, really for consistency it should be in Music in trad. script, at least IMO. More generally, the way I see it I think it would be best to handle it in either of the following two ways:
  1. Abolish the normal Mandarin topical categories altogether and just keep the script-specific ones.
  2. Keep the normal Mandarin topical categories but only have them (perhaps only to a certain extent) as categories to contain the script-specific ones. That is to say (and yea, I know this example is something of a simpler case) a similar structure to what I organised for Category:Hungarian noun forms ; a category only to contain subcategories.
I'm preferring the second option myself, really. 50 Xylophone Players talk 20:48, 6 December 2011 (UTC)
To answer your question, CodeCat, Category:cmn:Musical instruments in traditional script should appear both in Category:cmn:Music in traditional script and Category:cmn:Musical instruments, in my opinion. -- Liliana 22:05, 6 December 2011 (UTC)
Yea, I think that layout would work fine. 50 Xylophone Players talk 00:02, 7 December 2011 (UTC)
I'm confused about this discussion but I want to mention that Chinese categories shouldn't be renamed, deleted lightly, without checking what happened with the entries containing them. Renaming, deleting categories should be the last thing. Perhaps non-empty categories for any language should not be deleted/moved if the contained entries are not modified as well. --Anatoli (обсудить) 22:34, 8 December 2011 (UTC)
I don't know what you mean if you are trying to refer to any specific incident but to make a general comment, trust me, I really don't mind working at stuff here like creating categories and fixing entry categorisation (for example, the transition from "Category:Topic x" as English category to "Category:en:Topic x", with "Category:Topic X" as something of a metacategory). So what I'm saying is, if any categories are to be renamed I'll gladly help when I can with the moving of entries from one category to the next. As for your confusion about the discussion, this is pretty much what the end up of it all is: The layout of Chinese (well, at least Mandarin but surely the same would be fine too for Min Nan and any other "dialects" that use trad. and simp. Han characters) topical categories that Liliana suggested, and I agree with is as follows:
That is to say, that the category structure would have both of the above features. 50 Xylophone Players talk 00:39, 11 December 2011 (UTC)
There is a serious drawback about this structure... Category:cmn:Musical instruments will contain both the two 'child' categories Category:cmn:Musical instruments in traditional script and Category:cmn:Musical instruments in simplified script, as well as the two script-specific 'sibling' categories Category:cmn:Music in traditional script and Category:cmn:Music in simplified script. This could potentially make it harder to navigate because the children and siblings are mixed together (this problem is why we added the prefix en: to English topical categories). —CodeCat 01:05, 11 December 2011 (UTC)
I guess it doesn't change the point you're making, but it would be Music -> Musical instruments, not vice versa. Furthermore, as far as I can see, if this mixture of "children" and "siblings" is a problem IYO then I hate to tell you but it still exists in a way. Just look at Category:Music; it has both siblings like Category:en:Music, Category:fr:Music, etc. and true child categories like Category:Musical instruments, Category:Opera, etc. as children. This script-split scenario is a special case in a way and I think it would be alright to emulate the top level topical categories. Is it really that problematic? May I make a suggestion, and ask whether something somewhere could be altered to separate the upper- and lowercase categories in the subcategory lists? That is to say, Category:en:Music would be indexed under "e", not "E", while Category:Musical instruments would remain sorted under "M" and not "m". That would sort out the top level categories at least.
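The upper/lowercase separation suggested above could look roughly like this. It is a sketch in Python, not how the wiki software actually computes sort keys: sort_key and group_subcategories are hypothetical names, and the pattern used to recognise a language-code prefix is an assumption.

```python
import re

# Assumption: a per-language category is "Category:<code>:<Topic>", where
# <code> is 2-3 lowercase letters with optional hyphenated parts (e.g. roa-jer).
LANG_PREFIX = re.compile(r"^[a-z]{2,3}(-[a-z]+)*:")


def sort_key(category):
    """Hypothetical sort key: lowercase initial for per-language siblings,
    capitalised initial for true child categories."""
    prefix = "Category:"
    name = category[len(prefix):] if category.startswith(prefix) else category
    if LANG_PREFIX.match(name):
        return name[0].lower() + name[1:]
    return name[0].upper() + name[1:]


def group_subcategories(categories):
    """Group subcategories under the first character of their sort key.

    Plain code-point ordering puts "A".."Z" before "a".."z", so the true
    children are listed first and the language siblings after them.
    """
    groups = {}
    for cat in sorted(categories, key=sort_key):
        groups.setdefault(sort_key(cat)[0], []).append(cat)
    return groups


subcats = [
    "Category:en:Music",
    "Category:Musical instruments",
    "Category:fr:Music",
    "Category:Opera",
]
for heading, members in group_subcategories(subcats).items():
    print(heading, members)
```

With these four example categories the headings come out as M, O, e, f, so the children and the language siblings no longer share a heading.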
As for the Mandarin subcategory scheme Liliana put forward, and I drew out, I realise that would not work as we are talking about
"Category:cmn:xxxx" vs. "Category:cmn:xxxx in simp./trad. script". Consequently, I have two suggestions:
  1. Would there be some way to separate out the script specific categories and the true children into separate parts of the subcategory lists?
  2. Would adding extra language codes (since I know we don't use solely ISO codes) for script specific things be out of the question? TBH, I'd prefer not to have to deal with this unless it could be handled well by a bot. What I'm getting at is, say we made "cmn-t" and "cmn-s" the codes for traditional and simplified Mandarin. Then, all the categories of format "Category:cmn:xxxx in traditional script" would have to be moved to "Category:cmn-t:xxxx". So of course, then all the entries would have to be edited too... so yea, that'd be an awful lot of work. 50 Xylophone Players talk 22:50, 11 December 2011 (UTC)
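For the second suggestion, the renaming a bot would have to do can be sketched as a simple name mapping. This is an illustrative Python snippet only: "cmn-t" and "cmn-s" are the codes floated in the comment above and are not existing codes, and proposed_name is a hypothetical helper that handles category names only, not the edits to the entries themselves.

```python
import re

# Codes as proposed in the discussion; they are hypothetical, not existing codes.
SCRIPT_CODES = {"traditional": "cmn-t", "simplified": "cmn-s"}

# Matches e.g. "Category:cmn:Musical instruments in traditional script".
SCRIPT_CATEGORY = re.compile(
    r"^Category:cmn:(?P<topic>.+) in (?P<script>traditional|simplified) script$"
)


def proposed_name(category):
    """Map a script-specific cmn category to the proposed cmn-t/cmn-s name.

    Categories that do not match the script-specific pattern are returned
    unchanged.
    """
    match = SCRIPT_CATEGORY.match(category)
    if match is None:
        return category
    code = SCRIPT_CODES[match["script"]]
    return f"Category:{code}:{match['topic']}"


print(proposed_name("Category:cmn:Musical instruments in traditional script"))
# Category:cmn-t:Musical instruments
```

Every entry in a renamed category would still need its own edit, which is the "awful lot of work" part.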

I'm rather queasy about the use of an image of a convicted criminal here. Could some others have a look and give your opinion? After all, we're dealing with a human being, not an inanimate object. __meco 13:43, 6 December 2011 (UTC)

That looks a lot like a violation of w:personality rights to me. But I want other opinions. -- Liliana 13:49, 6 December 2011 (UTC)
I'm fine with it. The photo is already publicly available, and the caption is neutral and unemotional: it just says he is a convicted murderer. Equinox 13:50, 6 December 2011 (UTC)
At the risk of side-stepping, I think this sort of debate should be on Wikimedia Commons rather than on here. w:personality rights is a bit tricky, looking at the USA section, it often conflicts with the first amendment so as non-legal experts, it's hard or impossible for us to form an opinion that might later turn out to be legally valid. Mglovesfun (talk) 15:26, 6 December 2011 (UTC)
I agree. (By the way, I think that ideally, our content should be legal outside the U.S. as well; obviously we don't have any legal obligation to protect our mirrors from violating the laws of their own respective countries, but it would still be nice to minimize such issues.) —RuakhTALK 18:01, 6 December 2011 (UTC)

Guys.. I'm not raising the issue of legality. I'm simply pointing out that we could consider the human implications of this image and maybe try and find a way to do this which causes the least harm. Starting with, is this the mug shot we want to go with? Perhaps using an older image of a deceased person would be better? __meco 19:43, 6 December 2011 (UTC)

I am with you, there are plenty of mug shots of long since dead people, no reason to use one of a living individual. I am also with Ruakh, when it doesn't negatively impact the goals of the project we should do our best to be easily accessible and mirror-able. - TheDaveRoss 20:34, 6 December 2011 (UTC)
I am completely unbothered by our usage of this image — such mug-shots appear regularly in the news, even of people who end up not being convicted — but I would also be completely unbothered by you (or TheDaveRoss, or anyone else) changing it to an older image of a now-deceased person. —RuakhTALK 20:38, 6 December 2011 (UTC)
Oddly, [2] yields no results.​—msh210 (talk) 23:29, 6 December 2011 (UTC)

The image has an emotional impact, and provokes questions in the reader's mind about its appropriateness, even if we have resolved those questions as moot. An image that is clearly historical or fictional would be better as a pure academic illustration of the concept, and not a very personal-looking portrait of this particular guy.

There's no reason to show a living person, nor a child molester, nor someone whose victims' family and acquaintances might actually try to use Wiktionary to look something up. Michael Z. 2011-12-07 04:45 z

A mug shot of notorious bank robber and murderer w:Baby Face Nelson
I've taken the liberty of replacing the mugshot with Baby Face Nelson's. Michael Z. 2011-12-07 04:49 z
I'm happy with it. __meco 10:10, 7 December 2011 (UTC)

Odd glitch in Ladino suffix categories

There seems to be something wrong somewhere as in Category:Ladino words suffixed with -ero and Category:Ladino words suffixed with -ar (and quite possibly any other suffix categories in Ladino) the title on the page is appearing as "Category:Ladino words suffixed with ero-/ar-". Could someone fix this problem please? 50 Xylophone Players talk 21:15, 7 December 2011 (UTC)

This occurs because {{lad/script}} produces Hebr, so the -ero and -ar and so on are getting marked as right-to-left. I suppose the first question is — in what scripts do we want to include Ladino? If we only want Hebrew-script, then these categories should be deleted (problem solved), and if we only want Latin-script, then we should change {{lad/script}} to produce Latn. If we want both scripts, then either (1) we need to choose one as the default, and accept the use of sc=... for the other; or (2) we need to invent some language-code distinction between Latin-script Ladino and Hebrew-script Ladino. I have some thoughts, but I'd rather get input from someone who really knows Ladino. (I'm considered "Sephardi", but technically I'm actually Mizrakhi. My background is Persian, not Iberian. Not that I speak Bukhari, either!) —RuakhTALK 21:30, 7 December 2011 (UTC)
FWIW, I've done some research into this... at least enough to make me abandon the idea of learning Ladino myself. Historically, the language was only written in Hebrew script, but this has started to change over the past 50 years. Many dictionaries and grammars are using a transcription into Latin characters rather than Hebrew script. However, there isn't a standardized transcription used, and even individual books on Ladino that I've read will explicitly state that their transcriptions are somewhat dynamic with pronunciation, rather than strictly transcribing the written characters. So, the various dictionaries and texts in Latin scripts seldom match spellings of words. --EncycloPetey 04:11, 9 December 2011 (UTC)
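Option (1) in the reply above, a per-language default script with an explicit sc= override, can be sketched roughly as follows. This is an illustration only and not how the templates are actually implemented: the table, the function names and the fallback to Latn are assumptions made for the example, and the "lad" -> "Hebr" entry simply mirrors what {{lad/script}} is said to produce.

```python
# Hypothetical default-script table. The "lad" entry mirrors what
# {{lad/script}} is reported to return in the discussion above.
DEFAULT_SCRIPT = {"lad": "Hebr"}

# Script codes treated as right-to-left in this sketch.
RTL_SCRIPTS = {"Hebr", "Arab"}


def script_for(lang, sc=None):
    """An explicit sc wins; otherwise use the language default;
    otherwise fall back to Latn (an assumed fallback)."""
    if sc:
        return sc
    return DEFAULT_SCRIPT.get(lang, "Latn")


def direction_for(lang, sc=None):
    """Return "rtl" or "ltr" for the text direction the term is marked with."""
    return "rtl" if script_for(lang, sc) in RTL_SCRIPTS else "ltr"
```

With a Hebrew default, a Latin-script suffix such as -ero comes out as right-to-left unless sc=Latn is passed, which matches the glitch reported for the category titles.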

Nouns and proper nouns

I've been wondering for a while whether it is really a good idea to distinguish between common and proper nouns in headers and categories, and I can't really come up with any good reason why such separation is needed. I can think of several reasons, however, for why it would be good to merge them under a single ===Noun=== header:

  1. Simpler entry structure for entries like e.g. Egyptian, where there is a common noun sense (person from Egypt) and a proper noun sense (language) and these are both really the same word with the same etymology
  2. The distinction between common and proper nouns is not always clear, sometimes even disputed, with linguistic arguments and convention sometimes at odds. Non-distinctive labeling can thus spare editors quite a headache in some cases (language names and nationalities are those words which are most confusing). And what about names for people? Take Jane, for instance: in "I went to the park with Jane," Jane is grammatically clearly a proper noun, right? But in "There's a Jane in the office, and two more Janes work at your school," it seems that Jane is a common noun meaning any person whose name is Jane. It doesn't seem right to have a separate header (or even a separate sense) for this, because this usage possibility exists with all personal names, and indeed with all proper nouns. Nouns that are usually common can similarly be used like/as proper nouns, e.g. mother: "I went to the park with mother," whereas e.g. "my mother" or "all mothers unite" clearly display commonness. Common nouns can be used to make completely transparent proper names on the fly, etc., etc.
  3. For some languages, there are no grammatical features which indicate proper noun usage, yet the proper noun header is being thrust upon them, seemingly indicating a grammatical difference. These are e.g. the Slavic languages (barring perhaps Bulgarian/Macedonian, which have definite suffixes), Latin and Ancient Greek.

It seems to me to be more of a semantic distinction (though sometimes having some bearing on which words may be used alongside the main word, etc.). Compare e.g. the different categories of pronouns, grouped under the ===Pronoun=== header. What do you guys think? – Krun 02:40, 8 December 2011 (UTC)

You just said what I always wanted to propose to the community. I completely agree. There is no grammatical difference between Noun and Proper noun in Armenian. --Vahag 04:24, 8 December 2011 (UTC)
I can understand that this distinction doesn't matter much for some languages, but it certainly does for others like English. English proper nouns are almost never countable and can't be preceded by an article. Are there any other kinds of noun that are like that? —CodeCat 18:10, 8 December 2011 (UTC)
That's incorrect, Codecat. Is Egyptian a proper noun? It can certainly be counted, and is always capitalized. "The Louvre" is certainly a proper noun and is preceded by an article. Granted, besides demonyms and such things I can't think of any proper noun that can be preceded by an indefinite article. Grammatically speaking, proper nouns are just nouns that have special uses. — [Ric Laurent] — 18:53, 8 December 2011 (UTC)
Also there's many common nouns that aren't usually countable. — [Ric Laurent] — 18:59, 8 December 2011 (UTC)
I have always been against the distinction of noun and proper nouns, too. Many languages have virtually no proper nouns, and almost no language other than English observes this distinction. It has been my understanding that the main reason that proper nouns are distinguished in English is to explain capitalization. Monday in English is a proper noun, but lunes in Spanish is not. January in English is a proper noun, but январь in Russian is not. French is a proper noun in English, but français is not one in French. In many if not most non-Indo-European languages, there are almost no proper nouns except for names borrowed from other languages such as English. Yes, I have heard the argument that not all proper nouns are capitalized, but every example of an uncapitalized proper noun I've been given turned out to be a common noun. And that not all common nouns are uncapitalized, but all the examples of capitalized common nouns turn out to be the first word in a sentence, or part of a heading, or in fact a proper noun.
While it is true that proper nouns are said not to be countable, this is because of the feature of any proper noun that there is only one. But that rule is not a razor, it is only a clue. I think most proper nouns can be countable: there are many Steves, many Mondays, many Januaries; you can have multiple Sonys, two Germanies, multiple Africas, lots of Coca-Colas. So that rule is just a red herring. As Ric points out, many common nouns may be uncountable, such as mass nouns like water, food, fish.
I recently got into an argument with an anon about the proper noun Civil War and the Navajo word for it which is not a proper noun. He could not understand how a proper noun in English would not also be a proper noun in Navajo. In Navajo, almost the only proper nouns are names borrowed from English. But Navajo does not recognize nouns versus proper nouns...only native words versus words borrowed from English. —Stephen (Talk) 19:34, 8 December 2011 (UTC)
The capitalization rules are a modern innovation. It is thus a fallacy to think that "proper noun" originated as a concept to explain capitalization. Quite the reverse is true. In older English (older Modern English), nouns were capitalized far more frequently in random locations, and you can see this in many 17th and 18th century documents as a regular feature. Shakespeare's and Milton's capitalization would confuse any 21st century reader. And yet Locke wrote a lengthy discussion on the distinction between common and proper nouns in the midst of this, and did not rely on capitalization as any part of his argument. --EncycloPetey 04:05, 9 December 2011 (UTC)

I agree with what is explained. I can add that the tradition about what a proper noun is differs slightly between languages (e.g. in French, language names are never considered proper nouns). Nonetheless, I consider this an important distinction, and it should be kept, at least for languages using it. I also would add Surname and First name in addition to Proper noun, because they are very special cases. Lmaltier 19:43, 8 December 2011 (UTC)

I also add that, in French, some common nouns are capitalized (demonyms). Lmaltier 19:56, 8 December 2011 (UTC)
The distinction of proper noun and common noun is often explained as capitalisation but that in itself can make no sense at times. For example, in Dutch, there is a rule saying that a noun referring to a people should be capitalised if they are culturally united, and not otherwise. So Inuit is capitalised but indiaan is not. It's very arbitrary, and I don't think these words are proper nouns at all. At the very least, I would say all names for individual entities are proper nouns, like England, Julie, Mars and such. But after that I don't really know... Atlantic Ocean doesn't really seem like a proper noun to me, because grammatically it can easily be divided into an adjective + noun phrase. What makes it special is only that there happens to be one Atlantic Ocean, which is why we label it the Atlantic Ocean. For England, we don't really have a need to say the England, because grammatically 'England' already implies that it is unique. Now, you could say I don't know a Julie or I don't know any Julies, and here it seems like the word is countable. But I would say that in this case, 'Julie' has become a metonym for 'a person named Julie', just like a Picasso is a painting and not the person himself. —CodeCat 21:29, 8 December 2011 (UTC)
The trouble is where to draw the line. If a word can be used "as a proper noun" and "as a common noun" (with such easily and automatically connectable senses as "this certain person called Julie" and "someone called Julie"), how do we determine what it "really is", i.e. the primary function, proper or common? Arguably, it's fairly straightforward with people's names (or is it?), and even easier for place names, but what about e.g. Monday? I can see that it's actually marked simply as a noun now in its Wiktionary entry, although Easter Monday is marked as a proper noun, even though there is no difference between these except that the repeat cycle for a regular Monday is shorter. Which function is primary in these? Example for proper noun usage (actually more than one [sub]sense): "Monday is the second day in the week" (meaning the general concept of Monday within the context of the week, also as a general concept), "I'll be back by Monday" (meaning the next Monday specifically, compare "by 2012"); for common noun use: "the following Monday", "the first Monday in September", "two Mondays from now", "there's always a Monday after a Sunday" (all the same applies to Easter Monday, Christmas Day, Christmas, etc.). This, of course, also leads to the question of whether any of this matters for the header and categories of the word. The appropriate usage will usually be quite plain from the definition provided in the definition line, and can also be further illustrated by usage examples, quotations, and usage notes. I really don't see how "proper" in the header will offer any further clarification than the definition line. There are all kinds of other restrictions that semantics can impose on word alignment that we don't try (and fail) to clarify in the POS header, e.g. the difference between place names and personal names; consider these sentences: "I was in Montreal", "I was talking to Julie". They can't readily be reversed.
They must be used in a specific way, but for semantic reasons. – Krun 00:22, 9 December 2011 (UTC)
The distinction between any two parts of speech in English is fuzzy. This does not mean that there are not useable categories that have grammatical differences; it simply means that language is messy and does not always follow the strictures of grammarians. In "The meek shall inherit the earth;" the adjective meek is used as a noun. In "copper coffee kettle," the nouns copper and coffee function like adjectives in giving attributes to the following noun. Latin grammarians made no distinction between adjectives and nouns, and this division into two separate parts of speech is a fairly recent innovation. Similar blurry lines are easily found between adverbs and prepositions, and between certain verb forms and adjectives or nouns. This does not mean that we should abandon these distinctions, simply because the lines distinguishing them are not 100% clearly made. The important question is whether a different header will mark a significant difference in lexical content or in usage. In fact, proper nouns in English function quite differently from common nouns, and that's one reason we have far more arguments and discussions about inclusion of proper nouns en masse than we do for common nouns. It's also why we have more heated debates about defining proper nouns. See User:EncycloPetey/English proper nouns for a start on a page explaining both the philosophical distinction as well as rough notes begun for characterizing the differences in grammatical usage. --EncycloPetey 04:05, 9 December 2011 (UTC)
I suppose my main question is whether the distinction is useful where we use it. I believe it isn't, and in fact gets in the way, complicating entry structure and moving debates from where they belong. I don't dispute the existence of proper nouns, but, as your think tank page makes clear as well, the lines are far from clear-cut, and individual lexemes (unique words) cannot all be cleanly put in one category or the other. Many (perhaps most) can be both, whether by metonymy (or another type of transparent derivation applying to all nouns otherwise of the one category), or by clearer sense distinctions (Albanian: "specific language" or "person from Albania"). Since we have a separate header for proper nouns (and the Noun header thereby implies "common noun", which I don't actually like either), we cannot treat both cases under one header or the other. This is true of Monday and Julie as much as Albanian. It would mean e.g. that, for Julie, there would also need to be a (common) noun section if we were truly to be consistent. I, however, find such a treatment to be impractical; I want to treat both under the same definition line (recognizing the potential automatic metonymy as inherent), hence under the same header. But since this (secondary) sense makes it a common noun, I find it wrong to place it under the ===Proper noun=== header, and that is why I think a simple ===Noun=== header for all nouns, common or proper, would be better. This does not mean that any information is really lost; even without the "proper" in the header it is quite clear that London (referring to the capital of England) is a proper noun, that is if you actually know what a proper noun is, which you also must know to be able to understand the ===Proper noun=== header.
I think it would be much more useful to read a grammatical treatise on proper nouns, including a discussion of all the fine points, borderline cases/disputes, philosophical and grammatical considerations, etc., than to rely on (sometimes arbitrary/inconsistent) labeling in entries. We really should have some good general grammatical appendixes. – Krun 11:41, 9 December 2011 (UTC)
In French, and in English too (I think), all proper nouns may be used as common nouns (the Winston Churchill of Germany, etc.) But this use does not make them common nouns, it's a figure of speech. Lmaltier 18:21, 9 December 2011 (UTC)
It is true, I think, that all proper nouns (in most Indo-European languages at least) can be used as you describe. And on the contrary, it does make them common nouns in that instance (while not generally), although it clearly derives from another (perhaps the primary) sense of the word which happens to be a proper noun sense. – Krun 18:51, 9 December 2011 (UTC)
To recap: I think the property of "proper" or "common" belongs to specific, narrow senses (semantic) and not to words, as a whole (lexical), and should therefore (and for the practical reasons stated above) rank below the true lexical properties (POS), just as the definition lines are under the POS header. – Krun 19:27, 9 December 2011 (UTC)
POS is not a lexical property; it is a semantic property. That's why our senses are grouped under POS headers. I don't understand the distinction you seem to be making between "senses" and "words". When you say "word", do you mean "particular spelling"? For example, we currently list eating as a verb form, adjective, and as a noun. All three derive from a single source and have similar lexical content; only the grammar distinguishes them. Do you think, therefore, we should eliminate the separation of those three headers because this is all one "word", and instead indicate adj., verb, noun individually for each sense? --EncycloPetey 16:34, 11 December 2011 (UTC)
  • It's not obvious to me what's being proposed here. If the proposal is to completely ignore the distinction between proper nouns and common nouns, then I oppose. If the proposal is simply to demote this distinction to an inflection-line-note or a sense label, say {{context|proper noun}} or something, then I'd be O.K. with that. This distinction seems pretty comparable to the distinction between countable nouns and uncountable ones, or that between transitive verbs and intransitive ones; for neither of those distinctions do we use separate POS headers. —RuakhTALK 20:52, 9 December 2011 (UTC)
The proposal is simply to merge the ===Proper noun=== headers into the ===Noun=== headers. I like your analogy with countability and transitivity and agree that this is a very similar case; a context label for an individual sense is precisely where this information belongs. Of course, as with both countability and transitivity, the label may not always be needed; indeed I think in most cases it would be superfluous. Would we really put # {{context|proper noun}} {{given name|male}} or # {{context|proper noun}} The capital city of England? We could, of course, do that for all proper nouns, and although rather superfluous and a bit cluttering I would still like it better than the status quo. I would prefer, however, that such labels were only used when necessary for clarification of the sense in question or the difference between the various senses under the same header. – Krun 22:17, 9 December 2011 (UTC)
Krun just asked me to see and comment in this discussion. I've read parts of it and skimmed the rest, and agree with EP's post of 04:05, 9 December 2011 (UTC) and Ruakh's of 20:52, 9 December 2011 (UTC).​—msh210 (talk) 18:53, 11 December 2011 (UTC)
  • Given a vote - I would vote to merge - but I DON'T want to spend hours discussing the subject! —Saltmarshtalk-συζήτηση 16:33, 11 December 2011 (UTC)

Merge the headers. Sorting senses into Noun and Proper Noun headings offers more problems than solutions, and needlessly complicates entries. Most or all professional dictionaries seem to be doing this already. Michael Z. 2011-12-11 18:49 z

I'd vote to merge the two headers. --Panda10 19:10, 11 December 2011 (UTC)

Merging and {{context|proper noun}} seems fine to me. ~ Robin 20:21, 11 December 2011 (UTC)

The way I see it, there is either a difference between proper nouns and regular nouns or there is not. If there is a difference, then I don't see why you would want to merge the two. However, if the argument is that a {{context}} label is more appropriate, then fine. However, that raises all sorts of formatting headaches of its own, unless the intention is simply to ignore that aspect (as is sometimes the case around here), leaving people like me to spend the next few years slowly cleaning up the mess by hand (yet again). In the case of Mandarin entries, a context label requires a lot of information in order for the category sorting to function properly. For example, if one were to modify a proper noun entry such as 曲阿 with a context label, the context label would look something like:
{{proper noun|lang=cmn|script=traditional|script2=simplified|skey=刀13|skey2=qu1e1}}
If the noun and proper noun labels are simply merged without doing the appropriate context labels in some automated fashion (not a trivial task, I'm sure), this would create a lot of extra busy work for me. I'm still cleaning up the mess that was caused by moving away from zh-tw and zh-cn labels in Mandarin. -- A-cai 23:27, 11 December 2011 (UTC)
P.S. Note that entry already has an archaic context label, which further complicates the issue. -- A-cai 23:29, 11 December 2011 (UTC)

It would be helpful to clarify the terminology, at least as it applies to English. The Cambridge Grammar of the English Language is, I feel, uncharacteristically vague about this point, but it makes some useful distinctions between proper names and proper nouns. "The central cases of proper names are expressions which have been conventionally adopted as the name of a particular entity -- or, in the case of plurals, like the Hebrides, a collection of entities" (p. 515). "Proper nouns, by contrast, are word-level units belonging to the category noun. Clinton and Zealand are proper nouns, but New Zealand is not. ... Proper nouns function as heads of proper names, but not all proper names have proper nouns as their head" (p. 516). Under this definition, I believe demonyms like Egyptian would not qualify as proper nouns, despite being capitalized, while language names like Egyptian would qualify.

I agree that common nouns and proper nouns are all types of noun, and belong under the same top-level heading. Mind you, I also think pronouns do too. In contrast, proper names are not types of nouns but rather are NPs or nominals. --Brett 12:53, 12 December 2011 (UTC)

A more confusing edge case: The Chinese have a splendid cuisine. Is Chinese a (singular) collective proper noun or a (plural) mass common noun? Does the cuisine belong to the one and only Chinese People, or to the subset of persons who live in China, or the subset of persons of Chinese ancestry? Even the speaker might not be making a distinction.

Disregarding such ambiguity for the moment, I think properness of nouns is an aspect of their usage, and not an inherent quality of a term. We can indicate that a term is usually a proper noun, or not. This may be self-evident from the definition itself (for example if the referent is clearly a specific entity), or from the use of a term like "specific." Perhaps a usage label is helpful in some cases. Or perhaps we should always use a label, for clarity and categorization. Michael Z. 2011-12-12 16:17 z

I have created a vote. Please continue the discussion on the vote talk page and comment on the proposal as you find necessary. – Krun 01:17, 13 December 2011 (UTC)
Good job writing this up there. Michael Z. 2011-12-13 17:22 z
Changing Proper noun to Noun is easy, but changing Noun back to Proper noun is difficult, fr.wikt experienced it: we introduced the distinction between common nouns and proper nouns, because it was considered really useful. The change took a long time, and many pages got a wrong POS for a while.
Again, it should be considered that all proper nouns may be used as common nouns, it's a figure of speech, and, in most cases, such a use does not deserve a mention here. In French, in such a case, the nouns are capitalized, because they are proper nouns, not true common nouns, even if used as common nouns. When the figure of speech becomes so common that they really become true common nouns, they usually lose the capital (e.g. hercule, bordeaux) (usually but not always e.g. une Granny Smith). I know that, in English, they always keep the capital and that this rule makes the distinction much more difficult, but it exists too. Lmaltier 09:07, 18 December 2011 (UTC)
I have just noticed that some editors have been marking language names in other languages such as Catalan (Category:ca:Languages) and Spanish (Category:es:Languages) as proper nouns. They are not proper nouns in those languages, they're just common nouns. I did not check other common nouns such as months, weekdays, or nationalities, or any other languages, but I suspect this problem is widespread and would require a lot of work to clean up...unless a bot could be made to do it. —Stephen (Talk) 13:33, 19 December 2011 (UTC)

RecentChangesCamp 2012

Just a reminder: RecentChangesCamp 2012 is coming up soon! :D Please consider attending. :) It is a great opportunity to network with your fellow Wiktionary contributors. :) Invite all your wiki friends. :) You may be eligible to apply for a WMF Participation grant or a WM AU grant if you're from Australia or New Zealand. If you're considering coming from overseas and you're female, you may also be interested in the ADA Camp, which could help better justify the last minute trip to Australia. :D We'd love to see you at RecentChangesCamp. :D --LauraHale 10:06, 8 December 2011 (UTC)

Frenzy about deleting SoP's

I'd like to draw your attention to attempts to delete grammar terms like nominative case, diseases like lung cancer, professions, etc. I don't even think they are a case for Category:English non-idiomatic translation targets. Our CFI has flaws if such important terms get deleted. --Anatoli (обсудить) 22:08, 8 December 2011 (UTC)

I totally agree. Matthias Buchmeier 13:41, 9 December 2011 (UTC)
Well 'attempt' seems to be the operative word; if they pass the deletion process, what's the harm anyway? I agree that lung cancer in particular doesn't meet CFI, yet it'll probably pass. I'd argue it the other way, there's a frenzy about keeping SoPs that don't meet CFI. I'd welcome ideas that would make such terms includable. NB I don't mean I'd support any such idea, just that I'd like to at least see it. Having said that, I think that CFI should reflect what good editors do, and if things like lung cancer, nominative case, accusative case (etc.) are consistently considered valid by the community, I think CFI should reflect that. Mglovesfun (talk) 16:04, 9 December 2011 (UTC)
What Mg said (except his first sentence. For example, nominating things in bad faith would be a bad thing (not that that's what's happening)).​—msh210 (talk) 18:27, 9 December 2011 (UTC)
If terms belong to the vocabulary of the language, either the general vocabulary or a technical vocabulary, they should be included, even when SOP (e.g. Atlantic salmon). CFI should be changed to reflect this principle. Lmaltier 18:32, 9 December 2011 (UTC)
A word of warning: RFD is very different from a vote on CFI. To avoid deletion, there only needs to be no consensus, which, depending on the closing admin, means let's say about 40% opposition to the deletion. For a vote to pass on the matter, there needs to be 70% or more support. Which is why, irritatingly from my perspective, it's easier to vote keep outside of CFI on deletion debates than it is to change CFI to have these things meet CFI anyway. Mglovesfun (talk) 18:36, 9 December 2011 (UTC)
This is because our voting process is biased towards the status quo. The lesson here, paradoxically, is that if you want your entry to be kept, the best chance of success is if you just go ahead and disregard CFI and create it anyway. —CodeCat 19:01, 9 December 2011 (UTC)
I agree, but then voting can become a bit 'faddy', where a proposed idea can get a lot of support when it is proposed and later become quite unpopular. Mglovesfun (talk) 19:07, 9 December 2011 (UTC)
Almost all ordered processes are biased toward the status quo. That is what is required to maintain order. A process in an ordered system could be excessively or insufficiently oriented toward favoring the status quo or certain types of change.
There is no special reason to favor terms related to linguistics just because the community of active contributors is especially sensitive to the possible opaqueness or complexity of the accepted meaning of a term. I saw few defenders for the terms being added by an emergency medical services practitioner (apparent, anyway), which are likely to be more prevalent in a work community that does not normally commit its efforts to writing in scholarly journals and texts. I think it would be interesting to invite that contributor back to take a hard look at the linguistics and computing terms that we deem worthy of inclusion. This is already a very insular group. We don't need special defenses for our insular preferences. DCDuring TALK 20:06, 9 December 2011 (UTC)
He's still here (User:Luciferwildcat) and still adding stuff like HIV medicine. Equinox 20:09, 9 December 2011 (UTC)

Ypres demonym?

Is there a demonym for people from Ypres? Equinox 23:33, 9 December 2011 (UTC)

Yes: The wise Ypresians decided to do something and fast, lest the plague revisit them. (delayedreaction.blogs.com) Lmaltier 08:51, 10 December 2011 (UTC)
I could find no evidence of the use of Ypresian as a demonym. The OED [2ⁿᵈ ed., 1989] lists Ypresian, but only as a geologic adjective with the definition "Of, pertaining to, or designating the lowest stage of the Eocene in western Europe, lying above the Landenian. Also absol." Regrettably, however, I have no alternative to suggest, and can only suppose that a standard demonym exists formed on one of the Latin names for the city, viz. Ypra or Hyprae. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:21, 22 December 2011 (UTC)
I provide the evidence above. But, unsurprisingly, it's very rare in English (as a demonym), this is the only use I could find. Lmaltier 21:42, 23 December 2011 (UTC)
Oh, sorry; I overlooked the parenthesis. I think it's Yprois that Equinox was looking for; it's much more common. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 20:29, 24 December 2011 (UTC)
You are right, it's much more common (but very uncommon nonetheless). I would have imagined that it was a French word only. Lmaltier 07:46, 27 December 2011 (UTC)

Lojban

Please see Wiktionary:Votes/pl-2011-12/Banning Lojban entries. -- Liliana 21:03, 10 December 2011 (UTC)

Why does everything look Godawful?

Why does everything look Godawful? I mean the boxes. Equinox 23:22, 11 December 2011 (UTC)

Try retiling. No, seriously: what are you talking about?​—msh210 (talk) 00:32, 12 December 2011 (UTC)
That's the experimental thing that was being discussed above, to show only one language and eliminate the Contents from long pages and use tabs instead. Looks like an epic fail to me too. It displays no level 1 header at the top, which is confusing, and horizontally compresses all the content, which is very bad. Someone please make it go away. --EncycloPetey 00:50, 12 December 2011 (UTC)
That's strange, everything looks for me as it used to. I see all languages on one page as usual. —CodeCat 01:18, 12 December 2011 (UTC)
I agree with EP. The extra, poorly spaced hairlines all around the text are very distracting, the wasted horizontal space is bad, and a margin based on the length of the language name makes page layout completely inconsistent. Do not want. Michael Z. 2011-12-12 01:26 z
And I 'agree', if you will, with CodeCat. Sounds like I should be thankful. Or maybe I just need to clear my cache.​—msh210 (talk) 01:33, 12 December 2011 (UTC)
The latter.​—msh210 (talk) 01:37, 12 December 2011 (UTC)
Also, now there's no way to compare two different languages' entries for a word. Michael Z. 2011-12-12 01:31 z
Oh, and when there's a Translingual entry, that's the default (not English). &P --EncycloPetey 03:58, 12 December 2011 (UTC)
I agree with EP. The layout which I just disabled via my preferences after having experienced it for several hours, was appalling. The uſer hight Bogorm converſation 13:07, 12 December 2011 (UTC)
I've been away for a day and now everything looks godawful to me as well. How do I get back to the old look? SemperBlotto 08:05, 12 December 2011 (UTC)
Go to Special:Preferences#mw-prefsection-gadgets and uncheck "Enable Tabbed Languages. (Admin-only trial.)", second from the bottom of the "User interface gadgets" section. --Yair rand 08:07, 12 December 2011 (UTC)
I have two observations, one expanding on EP's point about Translingual's priority over English and one about the lack of a ToC to navigate long entries.
  1. At the very least, we should be able to select whether the Translingual or the English section merits priority placement, perhaps by placement of a template. In addition to the programming effort, this would require that we establish some criterion for determining whether the Translingual or English section for an entry merited priority placement. I would also see risk in creating such a template, for it could be used to force the default display of a specific language for any entry in which it were placed.
  2. For long English (or other language) sections, the lack of a ToC for the section makes navigation slower for experienced users and makes the scope of the entry less clear. I understand that it is not desirable to have both the right hand side ToC and the left-hand side tabs appearing at the same time. I don't know which of the two is more important from the point of view of the assumed target default user. Judging from the choices being made and accepted by our active contributors, that target default user seems to not be a native English monolingual user. DCDuring TALK 11:52, 12 December 2011 (UTC)
Good idea (DCDuring, allowing the community to choose whether English or translingual should show by default). If that's not technically possible, and the community wants these tabs anyway, we'd have to decide which we want as default (English or translingual). The missing TOC is also a problem, but it'd be unwieldy with the tabs: so how about a minimal TOC (just L3, perhaps, or, in cases of multiple etymologies, L4s also) that ties in visually to the language tabs?​—msh210 (talk) 18:00, 12 December 2011 (UTC)
I don't know if what I'm seeing looks different from what the others see, since I still use the Monobook skin, but I think it looks pretty good. I could understand someone saying he kind of prefers that one over this one, or this one over that, but "godawful"? I like it. —Stephen (Talk) 12:17, 12 December 2011 (UTC)
I like the new look too, and I'm using Vector. The number of times I want to compare two languages' entries for the same entry is minuscule compared to the number of times I don't want to. —Angr 13:26, 12 December 2011 (UTC)
I think change is always a shock, a bit like when you move into a new house and the decor is awful, you soon forget about it and go on with your life. Mglovesfun (talk) 13:52, 12 December 2011 (UTC)
Admittedly not so good for me, as I like to format all the language sections of a page, so I know I have to click on every section one at a time to look for errors. But yes, I like it. Mglovesfun (talk) 14:01, 12 December 2011 (UTC)
If you have "Edit pages on double click" ticked under Editing Preferences, you can double click and open the entire page, including all languages. —Stephen (Talk) 14:06, 12 December 2011 (UTC)

Why does this option have to abandon all the good features of the page index? Make the standard page index float at the top-right, only show sub-section links for the currently-selected language. You could even put a persistent "show all" checkbox at the bottom of the index, letting one easily toggle the behaviour instead of hiding away the option. Michael Z. 2011-12-12 16:27 z

Damn, good point re 'show all'. Mglovesfun (talk) 16:30, 12 December 2011 (UTC)
Hm. Maybe better as a tiny triangle widget or something else minimal, to avoid adding page clutter. Michael Z. 2011-12-12 16:31 z
  • The one feature that seems just wrong about this is that, as a default, this format expends a substantial portion of screen space on a column for the language tabs. For much of its length, the column is simply white space for longer (usually English) language sections. I have spent a great deal of time trying to increase the amount of useful content that appears on the landing page. This seems to take a large amount away in one fell swoop.
    Very few other websites that I have seen expend such prime real estate on navigation. Web pages have largely standardized on other placement of navigation, usually across the top in small type. If there is potentially an excessive proliferation of possible destinations, the widely accepted practice is sub-menus. I suppose it would require making some decisions about which languages merited inclusion in the top line of languages in the event that there are more than, say, 10 language sections. Perhaps none of our sacred slogans and principles offer us any basis for making such decisions.
    The hairline border doesn't make any visual sense. It is, at best, a bit of extravagant eye candy that wastes even more horizontal screen space.
    In any event, we depart from accepted Web practice at our peril. DCDuring TALK 20:11, 12 December 2011 (UTC)
    I think it would be a good idea to place the tabs horizontally in the place where the page name displays now. Since we use headword lines, we don't actually need the page name to be there anyway... —CodeCat 22:47, 12 December 2011 (UTC)
  • Note: It is possible that some users are seeing tabbed languages without the styles applied, because their browser is still loading the old version of the CSS. If the language tabs on the left look like just unstyled links, you need to purge your cache for it to show up correctly. --Yair rand 23:03, 12 December 2011 (UTC)

I quite like this new presentation. It has a strength in that it deals quite effectively with overlong entries. It also has weaknesses, one of the severest, IMO, being that it makes comparing homographic terms between languages more difficult. One solution to that is the inclusion of a show all / hide all toggle atop the column of language tabs; would that be possible? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:09, 13 December 2011 (UTC)

Good idea.​—msh210 (talk) 18:34, 13 December 2011 (UTC)

Now when I try to look at another language, I lose the entire entry, can see nothing, and can not even reopen the English version. bd2412 T 22:09, 13 December 2011 (UTC)

You mean when you click on a language tab? What browser/OS are you using? Do you have any gadgets or prefs enabled? Did it work right at some point before? --Yair rand 22:15, 13 December 2011 (UTC)
I've tried it with Safari, Chrome and Firefox - all with an identical look. BUT could the language names on the tabs be made smaller to allow more horizontal space for the substance of an entry? The more I use it the more I like it. —Saltmarshtalk-συζήτηση 15:35, 14 December 2011 (UTC)
Using IE. The problem still occurs. I go to a page and it looks normal (more or less); when I click on any language tab, the interior portion of the page (not the sidebars or top bar) goes blank. bd2412 T 18:32, 14 December 2011 (UTC)
What version of IE? Do the tabs also disappear? --Yair rand 22:06, 14 December 2011 (UTC)
The version is IE7. The tabs do not disappear, but clicking them does nothing. bd2412 T 16:26, 15 December 2011 (UTC)
Does it work now? --Yair rand 22:04, 15 December 2011 (UTC)
No. You have to click the tab and then refresh the page. It works normally in IE8 however. —Internoob 04:08, 18 December 2011 (UTC)
Just making sure: You're talking about actual IE7, not IE8/IE9's "compatibility mode" IE7 simulator, right? (Compatibility mode gives a false positive for "onhashchange" in window, which I just added a workaround for.) --Yair rand 02:06, 4 January 2012 (UTC)
I was actually talking about the compatibility mode, sorry. I was under the impression that it was the same as IE7. —Internoob 22:44, 10 January 2012 (UTC)

It doesn't work well on I, for example: the fact-file box pushes everything else down. This, that and the other (talk) 05:52, 15 December 2011 (UTC)

That's because the fact-file box is in the wrong place: it should be in the ==Translingual== section. (The problem is less obvious when not using Tabbed languages, but it's a problem nonetheless.) If we do turn on Tabbed languages by default, such problems will certainly be fixed. —RuakhTALK 04:57, 4 January 2012 (UTC)

Conference proceedings for eLEX2011: electronic lexicography in the 21st century

The proceedings are available here.--Brett 12:13, 12 December 2011 (UTC)

Of particular interest here might be "What Makes a Good Online Dictionary? – Empirical Insights from an Interdisciplinary Research Project"
Abstract
"This paper presents empirical findings from two online surveys on the use of online dictionaries, in which more than 1,000 participants took part. The aim of these studies was to clarify general questions of online dictionary use (e.g. which electronic devices are used for online dictionaries or different types of usage situations) and to identify different demands regarding the use of online dictionaries. We will present some important results of this ongoing research project by focusing on the latter. Our analyses show that neither knowledge of the participants' (scientific or academic) background, nor the language version of the online survey (German vs. English) allow any significant conclusions to be drawn about the participant's individual user demands. Subgroup analyses only reveal noteworthy differences when the groups are clustered statistically. Taken together, our findings shed light on the general lexicographical request both for the development of a user-adaptive interface and the incorporation of multimedia elements to make online dictionaries more user-friendly and innovative. "--Brett 14:07, 12 December 2011 (UTC)
That is an interesting paper. Unfortunately, while they've found some results, they readily acknowledge that there's a confounder they have yet to identify. Seemingly, though, the following rank is from most to least important for users of online dictionaries: reliability of content, clarity, up-to-dateness of content, speed, long-term accessibility, links to the corpus, and, approximately tied for last, links to other dictionaries, adaptability, suggestions for further browsing, and multimedia content. There's a fairly big drop between accessibility and links to the corpus, except for a definite cadre of users that ranked corpus links above everything but reliability. (As I said, they need to find the confounder.) —msh210 (talk) 17:18, 14 December 2011 (UTC)

Japanese transliteration - traditional or revised Hepburn transliteration?

Inviting Japanese editors here. We have a bit of an argument with User:Ryulong about the romanisation method, see Wiktionary_talk:About_Japanese/Transliteration#Hepburn romanization of long "i". The issue is around "aa" vs "ā" (in words like お母さん) and "ee" vs "ē" (in words like お姉さん). The revised method doesn't use macrons in this case, the traditional does. Your input is appreciated. --Anatoli (обсудить) 13:02, 12 December 2011 (UTC)

Eirikr and I were discussing this earlier at his talk page, and to quote what he said:
My own sense is that all doubled two-morae vowel sounds ああ, いい, うう, ええ, and おお / おう should be romanized as a single vowel letter with a macron, provided that the second mora is not (1) part of the next word element / kanji character, nor (2) part of a declining stem, such as the end of -i adjectives or the -u in verbs. Examples:
  • あたらしい: atarashii - the second i changes during conjugation
  • とりい: torii - the second i is part of the next kanji character
  • ちいさい: chīsai
  • おおきい: ōkii - o + macron for the long "o", but for the long "i" here, no macron since the second i changes during conjugation
  • どう: dō
  • 問う (とう): tou - the u changes during conjugation
  • おねえさん: onēsan
  • けいかく: keikaku
I tend to agree with him. In any case I would prefer if there would first be discussion and following discussion there be revisions. Haplology 13:52, 12 December 2011 (UTC)
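The rule quoted above can be illustrated with a small sketch. It is only an illustration, not an agreed Wiktionary convention: the function name `romanize_segments`, the doubled-vowel input spelling, and the idea of passing each word already split at morpheme and inflection boundaries are all assumptions made for this example. Long vowels are merged only inside a segment, so a second mora that belongs to the next kanji or to an inflecting ending is left alone, and えい is deliberately not merged, matching the keikaku example.

```python
import re

# Doubled-vowel spellings that become a macron vowel when both morae
# sit inside the same segment (hypothetical table for this sketch).
LONG_VOWELS = {"aa": "ā", "ii": "ī", "uu": "ū", "ee": "ē", "oo": "ō", "ou": "ō"}
_LONG_VOWEL_RE = re.compile("aa|ii|uu|ee|oo|ou")


def romanize_segments(segments):
    """Join pre-split segments, merging long vowels only within a segment.

    `segments` is a list of doubled-vowel romanizations, split by the caller
    at morpheme boundaries and before inflecting endings.
    """
    merged = [
        _LONG_VOWEL_RE.sub(lambda m: LONG_VOWELS[m.group()], segment)
        for segment in segments
    ]
    return "".join(merged)


if __name__ == "__main__":
    examples = [
        ["atarashi", "i"],    # あたらしい: final i inflects
        ["tori", "i"],        # とりい: second i is the next kanji
        ["chiisa", "i"],      # ちいさい
        ["ooki", "i"],        # おおきい
        ["dou"],              # どう
        ["to", "u"],          # 問う: u inflects
        ["o", "nee", "san"],  # おねえさん
        ["kei", "kaku"],      # けいかく
    ]
    for segments in examples:
        print("+".join(segments), "->", romanize_segments(segments))
```

The hard part in practice is the segmentation itself, which this sketch leaves to the caller; the kana spelling alone does not say where a morpheme boundary falls.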
I also don't see the sense in using ū but not ā, etc. The rule should apply to all the vowels, except on morpheme boundaries in compounds, before inflectional endings, etc. I would understand, however, a system which more closely resembles the kana spelling, such that because う=u means that うう=uu, whereas ū, ā, etc. could mean ウー, アー, etc., and in that system おう would always be ou, and おお oo. The one thing that's really strange is えい. It is sounded as a long e (=ええ, エー), so why is that not romanized ē if おう is romanized ō? – Krun 19:24, 12 December 2011 (UTC)
I have been romanising my translations as Haplology and I totally agree with this method. Yet, Ryulong has already changed the page - Wiktionary:About_Japanese/Transliteration, which I haven't reverted - notable examples are - okaasan and neesan. His point is that it is the revised Hepburn romanisation. One way or the other, we should discuss it first. --Anatoli (обсудить) 19:38, 12 December 2011 (UTC)
The traditional form of Hepburn is extremely out of date. It was only used in the first edition of Hepburn's Japanese-English dictionaries (in 1867), and all subsequent editions have eliminated the "use a macron for every long vowel" form that makes up the traditional form since 1872. The "aa", "ee", and "ii" morphemes (in words of Chinese origin) have not been identified as ā, ē, and ī for nearly 140 years. The only vowels that get the macron, aside from when ー is involved, are ō and ū, and in the rare instances where this comes up on this project (obāsan, okāsan, ā, ē, and onēsan) it was a simple change that I made to fix this glaring error in Japanese phonology and academics.
However, I would not oppose using a form of romanization that differentiates between the ō in tōi (tooi) and the ō in Tōkyō (Toukyou).—Ryūlóng (竜龍) 23:55, 12 December 2011 (UTC)
Romanising Tōkyō is too common to replace it with Toukyou, it's used quite often in Japan on signs, like other macrons, it's also the alternative English spelling. I don't think changing obāsan, okāsan to obaasan, okaasan and onēsan to oneesan is fixing a glaring error but thank you for fixing others. There is a general agreement to use "ii" except for chōonpu cases. --Anatoli (обсудить) 00:10, 13 December 2011 (UTC)
The Romanization guide of the MEXT is vague and it is not clear what is a long vowel. I totally agree with Eirikr in his way of Romanization cited above by Haplology. お母さん, お兄さん, お姉さん, and お父さん should be okāsan, onīsan, onēsan, and otōsan respectively. Just for your interest, there are a few sino-japanese words with a long i, such as 詩歌 shīka and 弑逆 shīgyaku. And remember, ordinary Japanese don't know Romanization of Japanese well. They are only accustomed to Japanese input methods using Roman letters. Don't believe them when they say 東京 is toukyou. We see ii very commonly such as in Niigata, perhaps because of the difficulty of distinguishing i and ī visually, but that doesn't mean it is a correct usage. Japanese don't care much about such differences, because Romanized Japanese is not an official writing anyway. — TAKASUGI Shinji (talk) 02:54, 13 December 2011 (UTC)
But ā, ī, and ē are no longer used, and have not been used since 1870, to transliterate the long a, i, and e as per the Hepburn dictionaries. Why should Wiktionary use a format that is heavily outdated and is not in use by any other Wikimedia project or any academics of the Japanese language for over one hundred years? I can show you the dictionary, and cite specific pages where "aa" (1, 468, 475), "ee" (82, 448), and "ii" (40, 53, 190, 191, 192, 193, 452, 453) are in use, and you will not find ā, ē, or ī in this book, because it doesn't include any wasei eigo. The only time that any of those vowels are indicated by macrons is if the ー is in use. The only modern publication that even comes close to using this format is the American National Standards Institute, but I do not think this was ever implemented and even then they omit I from the vowels that get macrons in native words.—Ryūlóng (竜龍) 09:42, 13 December 2011 (UTC)
Now I understand where your misconception comes from. The Hepburn romanization is not necessarily the same as what James Curtis Hepburn used. It doesn't matter whether he used aa or not. It is just the name of a way of romanization of Japanese based on some of his works. — TAKASUGI Shinji (talk) 04:01, 14 December 2011 (UTC)
We're having a bit of a confusion over at the English Wikipedia page as to what is and is not considered the romanization scheme. We have conflicting sources on the aa/ā issue in particular that we have been trying to iron out. The English Wikipedia, at least, has been using the form in the subsequent editions of Hepburn's dictionary, rather than what is apparently a form derived from his and is defined as the "modified" form there.—Ryūlóng (竜龍) 12:00, 14 December 2011 (UTC)
We still have to address the changes in the rules Wiktionary:About_Japanese/Transliteration, which Ryulong has made. I haven't reverted his changes, so that you have a chance to see what's acceptable or not, specifically where macrons or doubling of vowels is used. --Anatoli (обсудить) 05:06, 14 December 2011 (UTC)
Regardless of what the discussion is here, I think it is better to include the kanji when describing the various long-vowel rules, thereby showing that 格子 and 子牛 are transliterated differently, even though both are こうし.—Ryūlóng (竜龍) 12:00, 14 December 2011 (UTC)
In the word 子牛, こ and うし belong to different roots, so "koushi" is the correct romanisation. 格子 doesn't exist yet but per our convention, it should be romanised as kōshi. --Anatoli (обсудить) 23:00, 14 December 2011 (UTC)
While that is fine, my issue is still with how Wiktionary deals with the long vowels a, e, and i in words of Chinese or Japanese origin, particularly as the Hepburn dictionary does not use the ā, ē, and ī forms after 1872. I am told the Kenkyusha dictionary, which uses a similar system treats the vowels differently from the original Hepburn dictionary, but I do not have access to this because it is not in the public domain.—Ryūlóng (竜龍) 09:48, 16 December 2011 (UTC)
(Chiming in a bit late here, was busy then sick, but I'm better now and have some time.)
Pulling in some more text from my Talk page, clarifying comments added in square brackets:

For Vowels: Judging from all the kerfuffle over at w:Talk:Hepburn_romanization, it's pretty clear that the terminology used to describe the different Hepburn variations is itself far from clear. Then reading the w:Hepburn_romanization page [particularly at w:Hepburn_romanization#Long_vowels], it's clear that Hepburn himself was a bit confused -- long "e" in his first romanization system, Traditional Hepburn, was variously romanized as e or ē, which just seems sloppy, and for some reason long "i" alone of the vowels was never romanized as ī, which seems inconsistent with the other vowels.

The main WP article doesn't mention it, but the Talk page does describe use of i + macron in the Revised (or was that Modified? Or Revised Modified? Or Modified Revised?) Hepburn system, albeit with the only examples given of borrowed katakana words with the 長音符 [chōonpu: "long sound mark", i.e. the ー mark].

If I understand you correctly, Ryulong, it sounds like you advocate using macrons only for the long "o" spellings おお, おう, or オー, and the long "u" spellings うう or ウー, and not using macrons for any of the long "e", "a", and "i" spellings. What is your reasoning for this? Is it purely borne of a desire to match what the Hepburn dictionary uses / has used? I confess I do not understand your position. w:Hepburn_romanization#Long_vowels illustrates that modified Hepburn uses macrons for all long vowel spellings other than "i", excluding cases where the second mora belongs to a separate morpheme. Is it your position that modified Hepburn is incorrect somehow? If so, could you explain?
One thing that bothers me about modified Hepburn is the inconsistency regarding long "i": if we are to use macrons to indicate long vowels, then why not for the long "i"? The morpheme boundary distinction makes sense to me, so いい ("good") would always be transcribed as ii, since the second "i" mora is the declining adjectival ending and thus part of a different morpheme. However, for 新潟, the two "i" morae are part of a single morpheme, and thus the more consistent transcription would be Nīgata, using the macron to indicate the long "i". Essentially, this is my only suggested departure from modified Hepburn, as described at w:Hepburn_romanization#Long_vowels.
I look forward to further discussion. -- Eiríkr ÚtlendiTala við mig 18:07, 16 December 2011 (UTC)
A quick PS -- the 7th edition of Hepburn's dictionary found on Google Books at http://books.google.com/books?id=vKAPAAAAYAAJ&printsec=frontcover&source=gbs_atb#v=onepage&q&f=false is from 1903. I think it's safe to say that ideas about romanization may have changed somewhat in the intervening century and some. It's clear too that the Japanese language itself has changed; Hepburn gives okkasan as the word for "mother" rather than the modern Japanese okāsan. -- Eiríkr ÚtlendiTala við mig 18:26, 16 December 2011 (UTC)
My main issue is that I don't have a copy of the Kenkyusha to determine what has happened since 1903. I only have documents such as this document from 1972 from the w:American National Standards Institute. And on why I promote the long o and long u only is because this is what I have become accustomed to on the English Wikipedia, at least until another editor there discovered that the page had been in error for five years.—Ryūlóng (竜龍) 20:57, 16 December 2011 (UTC)
Thank you for the explanation, Ryūlóng. That helps me see a bit better where you're coming from.
Looking at the ANSI document, I note that this advocates using the macron for long single-morpheme vowels, such as ああ ā and ねえさん nēsan given as examples under Table 2. (The document also has some internal inconsistencies, such as romanizing し as shi in some places and as si in others, but given the flow of the text, my impression is that si is a leftover from an earlier version that was missed during editing.)
I just checked the Kenkyusha Online Dictionary (kod.kenkyusha.co.jp) from my work account (login required), and it seems they don't use either traditional or modified Hepburn as described on the WP pages, but rather use their own romanization scheme that seems to be based on traditional Hepburn. Examples from their 新和英大辞典(第 5 版) ("New JA-EN Big Dictionary, 5th Edition", published July 2003, and the only dictionary on the site that uses rōmaji) of the differences I've found from modified Hepburn as described at w:Hepburn_romanization#Long_vowels, and from other general best practice:
  • Romanization based on traditional Hepburn, i.e. not using macrons for long "a" or long "e" in native Japanese words, though resolving Hepburn's possible confusion where long "e" was inconsistently accounted for:
  • No spaces between independent morphemes, no capitalization of proper nouns:
    • 新潟県中越大震災 (にいがたけんちゅうえつだいしんさい): niigatakenchūetsudaishinsai -- A more ideal romanization even just using traditional Hepburn might be Niigata-ken Chūetsu Dai Shinsai.
  • Non-Hepburn handling of long "o", where おお is marked as "oo" and おう is marked as "ō":
    • 大海烏 (おおうみがらす): ooumigarasu
    • 大奥 (おおおく): oooku
    • 通る (とおる): tooru
      -- but:
    • 観光 (かんこう): kankō
    • 放る (ほうる): hōru
Poking around the site, I can find nothing that describes or explains their romanization system. I presume that such is probably included in the introductory pages of the dead-tree version, but I do not have access to that.
For my part, Kenkyusha's romanization scheme strikes me as more arbitrary and less consistent than I am comfortable with. I find that I again come back to the opinion that single-morpheme long vowels should be indicated with the macron instead of using doubled vowels, as a slight change to modified Hepburn. (Incidentally, this appears to be more consistent with spelling conventions for other languages that use long vowels, such as Māori or Latin.)
Is there a particular reason to hew to Kenkyusha's style that you could articulate? -- Eiríkr ÚtlendiTala við mig 23:22, 16 December 2011 (UTC)
I actually find the arbitrary choices that have been made in the 5th edition to be exactly what I'm looking for in a romanization scheme, except for the spacing and capitalization issues.
I've recently come across some Library of Congress documents that say they use the "Modified" Hepburn scheme utilized by the Kenkyusha's 3rd edition (1983) and they use ā for ああ in "ああしたい". However, this other Library of Congress publication, an ALA-LC document (1997), uses "aa" for the same phrase.—Ryūlóng (竜龍) 04:57, 17 December 2011 (UTC)

Looking through Special:UnusedCategories, the bot BabelAutocreate has created a lot that we'd consider non-standard, such as this one. Category:User fiu-vro-1 is another one (should be Category:User vro-1). Can we delete these, do they require an RFDO or not? Mglovesfun (talk) 13:55, 12 December 2011 (UTC)

Speedily delete them IMO. That's what we do with mistaken bot-created entries (e.g., Tbot).​—msh210 (talk) 18:03, 12 December 2011 (UTC)

Default tabbed non-English language

Does the new tabbed languages view have some sort of hierarchy of languages for the sake of deciding which language gets displayed as default? Above people have discussed the problem of Translingual appearing rather than English, but at adobar I notice that Spanish appears rather than Catalan, despite Catalan being earlier in the alphabet. Is this intentional? —Angr 17:56, 12 December 2011 (UTC)

Catalan appears for me.​—msh210 (talk) 18:01, 12 December 2011 (UTC)
I would like this to be selectable as a preference. I often work in one language specifically, and it would be useful if that language could be set as the default while I'm doing that. —CodeCat 19:26, 12 December 2011 (UTC)
The hierarchy goes like this, in order of priority:
  1. If the user arrives at an entry being directed to a specific section (from a link like [[foo#Middle English]] or {{l|enm|foo}}) it goes to that section, otherwise:
  2. If there's a translingual section it goes to that.
  3. If there's an English section it goes to that.
  4. If the language of the most recently viewed non-English/Translingual section is available, it goes to that.
  5. If there's a language targeted through targeted translations it goes to that.
  6. Otherwise, it goes to the first language on the page.
--Yair rand 22:08, 12 December 2011 (UTC)
What do you mean by "the most recently viewed non-English/Translingual section": most recently viewed in any entry?​—msh210 (talk) 18:36, 13 December 2011 (UTC)
Yes. --Yair rand 20:33, 13 December 2011 (UTC)
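For readers who find the priority list easier to follow as code, here is a rough sketch of the order described above. It is not the gadget's actual source: the function name `default_section` and its parameters are hypothetical, and falling through to the next rule when a requested section is missing from the page is an assumption.

```python
def default_section(languages, anchor=None, last_viewed=None, targeted=None):
    """Pick which language tab to show first, following the listed priority.

    languages:   section names in page order, e.g. ["Catalan", "Spanish"]
    anchor:      section the incoming link pointed at, if any
    last_viewed: most recently viewed non-English/Translingual language
    targeted:    language chosen through targeted translations, if any
    """
    if not languages:
        return None
    # 1. An explicit section link wins.
    if anchor in languages:
        return anchor
    # 2 and 3. Translingual first, then English.
    for name in ("Translingual", "English"):
        if name in languages:
            return name
    # 4. The language the reader looked at most recently, in any entry.
    if last_viewed in languages:
        return last_viewed
    # 5. The targeted-translations language.
    if targeted in languages:
        return targeted
    # 6. Otherwise the first section on the page.
    return languages[0]


if __name__ == "__main__":
    page = ["Catalan", "Spanish"]
    print(default_section(page))
    print(default_section(page, last_viewed="Spanish"))
```

Under these assumptions, two readers can see different defaults on the same page (Catalan for one, Spanish for another) purely because of rule 4, which may be what happened at adobar.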

Tabbed languages and the use of Template:l

Now that tabbed languages has been enabled, it works pretty well, even if there is some room for improvement. One problem I see is that a lot of languages other than English use plain links to link to words in that language. With tabbed languages, it will link to the English section instead, and this may confuse users. We already have a template that can link to a specific language section, which would be {{l|nl|wat}} to link to wat#Dutch. But this template still isn't used in most articles, which could create a usability issue if tabbed languages is enabled for all users. Do you think we should make this template required instead of optional? And can a bot be used to fix any links? —CodeCat 19:23, 12 December 2011 (UTC)
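To make the bot question concrete, here is a minimal sketch of the kind of rewrite such a bot might perform. It is an assumption-laden illustration, not an existing bot: the function name `templatize_list_links` is made up, it only touches list lines (those starting with `*`), it takes the language code from the caller rather than working out which language section a line belongs to, and it leaves piped links and links that already carry a `#Language` anchor untouched.

```python
import re

# A bare [[word]] link: no pipe, no #section, no nested brackets.
_PLAIN_LINK_RE = re.compile(r"\[\[([^\[\]|#]+)\]\]")


def templatize_list_links(wikitext, lang_code):
    """Replace bare links on list lines with {{l|lang_code|word}}."""
    result = []
    for line in wikitext.splitlines():
        if line.startswith("*"):
            line = _PLAIN_LINK_RE.sub(
                lambda m: "{{l|" + lang_code + "|" + m.group(1) + "}}", line
            )
        result.append(line)
    return "\n".join(result)


if __name__ == "__main__":
    sample = "====Derived terms====\n* [[wat]]\n* [[wat#Dutch|wat]]"
    print(templatize_list_links(sample, "nl"))
```

A real bot would also need to know which language section each list sits in, which this sketch sidesteps by having the caller supply the code.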

I think {{l}} should definitely be used everywhere a plain link is used, though a [[word#Lang|word]] might be used in running text if editors prefer. – Krun 20:17, 12 December 2011 (UTC)
This shouldn't come up in running text, because then it's either an English word (where #English is not really needed), or else it's a mention (and should be using {{term}}). The only exception I can think of is quotations, where we occasionally linkify words for various reasons. —RuakhTALK 20:25, 12 December 2011 (UTC)
So the assumption then is that a bare link is supposed to link to the English section, and when the word in question is not English, it's an error? I imagine this would be mostly in lists of terms like derived terms, see also, alternative forms and so on. On the other hand, for consistency it would be better to use {{l}} all the time, even for English, because it makes it much easier for a bot like AutoFormat to spot the mistake. —CodeCat 20:36, 12 December 2011 (UTC)
Oh, I agree that we should always use {{onym}} or {{l}} in lists of terms. But in running text we frequently linkify English words — for example, one sense-line at [[lake]] links to [[water]] — and I don't think those would benefit from {{l}}. —RuakhTALK 21:07, 12 December 2011 (UTC)
By running text, I mostly mean the definitions themselves, and usage examples. – Krun 01:30, 13 December 2011 (UTC)
But definitions are in English, and usage examples aren't linkified. —RuakhTALK 02:48, 13 December 2011 (UTC)

First we build a website where different-language terms are on the same page, now we're adding a complicated interface that lets us pretend they're not on the same page. Doing this will break the most fundamental wikitext construction, the link.

Please rethink this whole undertaking. If there's a problem with multiple language terms appearing on the same page, then why don't we put them on different pages? Michael Z. 2011-12-12 20:33 z

I agree with you completely and I think it would be much better if every language had its own page. But it would take a lot of work to make that work right, and most people like to keep things as they are... —CodeCat 20:36, 12 December 2011 (UTC)
I'm not sure we agree. I'm implying that this project will make Wiktionary worse. Michael Z. 2011-12-12 20:51 z
In that case you might want to take a page from lawyers, by not asking rhetorical questions when you don't know what answers you'll get. :-P   —RuakhTALK 21:03, 12 December 2011 (UTC)
In any case, using separate pages for different languages would only make the need for {{l}} even more obvious. —CodeCat 21:31, 12 December 2011 (UTC)
If you move the contents of café#French to fr/café, slap a navbar of tabs on the top of the English entry at café, then you'll have what you want, without any javascript. It goes without saying that we'd have to agree on a major change to the whole project before this could happen. Forgive me for not also mentioning that creating a separate website interface, based on a completely different mental model of pages and links, for some portion of editors, would have drawbacks. Michael Z. 2011-12-12 21:43 z
I just want to make it clear that I would like {{l}} (or optional #Language in the link) to be standard for English words as well, since, when an English word is being linked to, it would be desirable to skip long TOCs (under the non-tabbed system) and the Translingual section (under both systems), since Translingual sections come before English ones. Also, such explicit linking would be more machine readable (e.g. for import of Wiktionary into projects with a database structure that splits up entries by language). – Krun 22:59, 12 December 2011 (UTC)
FWIW, I already use {{l}} in every case (partially to skip unnecessary TOCs) and I'd support requiring it as standard. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:04, 13 December 2011 (UTC)
So the basic link to [[term]] in an entry becomes obsolete, and we have to use {{l|en|term}} every time? You're breaking basic wikitext linking for all editors, for the benefit of a cobbled-together interface that's optional, and presumably will only be used by a minority of editors. No, thank you. Michael Z. 2011-12-13 00:24 z
Trading ease of viewing for ease of editing doesn't seem like a good alternative, though. Using {{l|en|term}} makes Wiktionary more usable because it brings users closer to what they want to see. —CodeCat 00:36, 13 December 2011 (UTC)
Yup. I find it very frustrating when I click a link (a word in the definition, synonym, etc.) and subsequently find myself at the top of a big page, having to scroll down, often through multiple language sections. This often happens with Icelandic, but also quite a bit with the shorter words in English, where there are many language sections and the TOC is very big, or in entries with a Translingual section. – Krun 01:30, 13 December 2011 (UTC)
Shall we have a straw poll to gauge support for such a standard requirement? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 11:55, 13 December 2011 (UTC)

Straw poll

Please sign below with {{subst:support}}. Sign as many as apply to you.

Please don't sign until 22:00 on 14 December (UTC), so as to allow for editing of the options until that time. Do edit the options until that time, heavily.

NOTE: All of these options exclude links that are generated by other templates (like {{term}}).

My support below is for {{l}}, not [[term#Language|term]].
  1. Support CodeCat 11:54, 16 December 2011 (UTC)
  2. Support (though I actually use {{onym}} rather than {{l}}; same difference). But the only reason I'm voting this way here is that I'm only voting below on links in lists, which are mentions (though not italicized), so special markup makes sense. For links that are uses, such as in "A large cat", I think [[word#English|word]] is preferable; it's just that I'm also fine with plain [[word]] for that case, so that preference doesn't show up in my votes. —RuakhTALK 18:56, 18 December 2011 (UTC)
  3. Support — Since {{l}} specifies the proper script automatically. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:36, 22 December 2011 (UTC)
My support below is for {{l}} or [[term#Language|term]], as individual editors decide.
  1. Support Internoob 00:37, 17 December 2011 (UTC)
  2. Support —Stephen (Talk) 13:22, 19 December 2011 (UTC)
My support below is for [[term#Language|term]], not {{l}}.
  1. Support. —msh210 (talk) 06:00, 16 December 2011 (UTC)
However, if people prefer another of the above options, I still maintain my support below.
  1. Support. —msh210 (talk) 06:00, 16 December 2011 (UTC)
  2. Support Internoob 00:41, 17 December 2011 (UTC)
  3. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:36, 22 December 2011 (UTC)
I support the use of language-targeted linking to generate all links in definitions to English entries if Tabbed Languages is enabled by default for all users.
  • Support, for links in non-English entries only. No. If I understand the hierarchy of rules correctly, a plain [[link]] will go to an English or Translingual section (if present), so no need for targeted linking. —msh210 (talk) 17:26, 18 December 2011 (UTC)
  1. Support CodeCat 11:54, 16 December 2011 (UTC)
  2. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:36, 22 December 2011 (UTC)
I support the use of language-targeted linking to generate all links in definitions to English entries if Tabbed Languages is disabled by default for all users.
  1. Support CodeCat 11:54, 16 December 2011 (UTC)
  2. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:36, 22 December 2011 (UTC)
I support the use of language-targeted linking to generate all links in lists to English entries if Tabbed Languages is enabled by default for all users.
  1. Support CodeCat 11:54, 16 December 2011 (UTC)
  2. Support. RuakhTALK 18:56, 18 December 2011 (UTC)
  3. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:36, 22 December 2011 (UTC)
I support the use of language-targeted linking to generate all links in lists to non-English entries if Tabbed Languages is enabled by default for all users.
  1. Support CodeCat 11:54, 16 December 2011 (UTC)
  2. Support Internoob 00:40, 17 December 2011 (UTC)
  3. Support for Latin-script targets, because a [[plain link]] goes (if I understand the hierarchy of the script's rules correctly) to an English or translingual section if present. I don't support this for other-script targets, since (again if I understand correctly) then the target will be the section linked from (unless targeted translations is in force). —msh210 (talk) 17:26, 18 December 2011 (UTC)
  4. Support. Unlike msh210, I especially support it for non–Latin-script entries, because then we want the appropriate script template to be used (e.g., {{onym|he|...}} invokes {{Hebr}}). But even for Latin-script entries, we want the lang=fr in our generated HTML. (Plus what msh210 says.) —RuakhTALK 18:56, 18 December 2011 (UTC)
    Good point about script templates. I use {{Hebr|[[foo]]}} myself, but I suppose {{l}} is easier to use (if more burdensome on the servers). —msh210 (talk) 19:41, 18 December 2011 (UTC)
  5. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:36, 22 December 2011 (UTC)
I support the use of language-targeted linking to generate all links in lists to English entries if Tabbed Languages is disabled by default for all users.
  1. Support CodeCat 11:54, 16 December 2011 (UTC)
  2. Weak support. I generally do this, but it doesn't bother me if other people don't. —RuakhTALK 18:56, 18 December 2011 (UTC)
  3. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:36, 22 December 2011 (UTC)
I support the use of language-targeted linking to generate all links in lists to non-English entries if Tabbed Languages is disabled by default for all users.
  1. Support CodeCat 11:54, 16 December 2011 (UTC)
  2. Support. I especially support it for non–Latin-script entries, because then we want the appropriate script template to be used (e.g., {{onym|he|...}} invokes {{Hebr}}). But even for Latin-script entries, we want the lang=fr in our generated HTML. (Plus what msh210 says.) —RuakhTALK 18:56, 18 December 2011 (UTC)
  3. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:36, 22 December 2011 (UTC)
I oppose all targeted linking of this sort.

straw poll — discussion

I don't think there is really a need to differentiate between whether tabbed languages is enabled or disabled. In both cases, there will always be some users wanting to use tabbed languages, and we should be able to accommodate them too. —CodeCat 19:38, 13 December 2011 (UTC)
What is intended by the phrase "links in definitions to non-English entries"? Since "links that are generated by other templates (like {{term}})" are excluded, this can't mean form-ofs . . . can someone point to an example entry that has such a link? —RuakhTALK 20:49, 13 December 2011 (UTC)
Good point: we shouldn't have such links. I'll remove that section.​—msh210 (talk) 21:13, 13 December 2011 (UTC)
I just realised something else. If we make the use of {{l}} required so that all links always link to a specific language, then it should naturally be required for all templates that generate links to do this as well. This means that the language parameters of {{term}}, {{form of}}, {{plural of}} and such would be required as well, unless all these templates are made to default to English (a bot should be able to trace erroneous uses that lack a language parameter). Maybe for definitions, a new template {{d|term}} can be used instead of {{l|en|term}} for ease of use, since the assumption is that it always links to a definition in English? —CodeCat 21:02, 13 December 2011 (UTC)
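As a rough illustration of the bot check mentioned above, the following Python sketch flags bare [[links]] in list lines and rewrites them as {{l|en|...}}. The function names, the regular expression, and the rule that only lines starting with * count as list items are assumptions made for this example. It is not how AutoFormat or any existing bot works, and it deliberately leaves piped or section-targeted links alone.

```python
import re

# Matches a plain wikilink such as [[water]]. Links with a pipe or a
# section anchor (e.g. [[word#French|word]]) are intentionally not matched.
BARE_LINK = re.compile(r"\[\[([^\[\]|#]+)\]\]")


def find_bare_list_links(wikitext):
    """Return (line_number, target) pairs for bare links in list items."""
    hits = []
    for number, line in enumerate(wikitext.splitlines(), start=1):
        if not line.startswith("*"):
            continue  # assumption: only list lines are checked
        for match in BARE_LINK.finditer(line):
            hits.append((number, match.group(1)))
    return hits


def to_l_template(line, lang="en"):
    """Rewrite every bare link on one line as {{l|lang|target}}."""
    return BARE_LINK.sub(
        lambda match: "{{l|%s|%s}}" % (lang, match.group(1)), line
    )
```

For example, to_l_template("* [[water]]") gives "* {{l|en|water}}", while a line that already uses {{l}} or a [[word#Lang|word]] link is returned unchanged.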

Is the intention to enable tabbed languages as an option for registered members, or the default for registered members, or admins, or the default for all? (I'm an admin, and I've turned off the tabs display.) Michael Z. 2011-12-13 21:56 z

Well, it's been an option for all registered users for a while. The intention (of some) is that it be the default henceforth, and for everyone.​—msh210 (talk) 23:15, 13 December 2011 (UTC)
What an appallingly nasty thing to perpetrate on unregistered users. DCDuring TALK 00:15, 14 December 2011 (UTC)
Thank you for giving your opinion. I suggest you try showing the tabbed interface to someone who is not a Wiktionary regular, and asking them whether it is simpler/easier to use than the normal "stacked" display. --Yair rand 05:10, 14 December 2011 (UTC)
Please see #Feedback from user-noneditors hereinafter. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 04:38, 21 December 2011 (UTC)
That is a great idea. Michael Z. 2011-12-15 16:17 z

Don't forget a point: readability of internal pages is already bad, because of the heavy use of templates. Contributors coming from Wikipedia are expected to know the normal link syntax, not this l template. Why make things more complex unnecessarily? The current complexity is discouraging for new editors. Lmaltier 20:05, 16 December 2011 (UTC)

I agree with Lmaltier here. Using normal [[]] syntax should not break anything. This tabbed languages proposal (as I understand it) would mean that plain [[]] links would only show the English term, and this sounds a lot like breaking the current functionality.
That said, I also support the idea of more advanced editors using {{l}} or [[word#Lang|word]] to link to entries in specific languages. As Krun and others have pointed out above, the use of plain [[]] links can impede usability when the destination page is quite long. -- Eiríkr ÚtlendiTala við mig 20:22, 16 December 2011 (UTC)
I think the default of automatically going to English if it exists is kind of annoying. It often 'surprises' me and catches me off guard when I expect to be taken straight to the language I was working in. —CodeCat 23:32, 18 December 2011 (UTC)

Feedback from user-noneditors

I was prompted by Yair rand's call for user-noneditor feedback on the new tabbed-languages interface (see his post hereinbefore, timestamped: 05:10, 14 December 2011) to gather such feedback from my friends and coresidents. I surveyed eight people, asking each of them to look at [[alba]], first as an anonymous user and then logged in on my account, and then to tell me which presentation they preferred and why. That is a tiny and probably unrepresentative sample, so obviously I don't expect my findings to be authoritative, but perhaps they typify the perspectives of users who aren't affected by the often conflicting priorities of editors. So, without further ado…

  • All eight surveyed preferred the tabbed-languages interface to the stacked interface. Strength of preference was directly correlated with enthusiasm, with frequent Wiktionary users strongly preferring the change, whilst those who cared little expressed but a slight preference.
  • The tabbed-languages interface was felt to improve presentation by making it far easier to find the information for which one is looking, offering a clearer separation between languages, getting rid of enormous (and usually unwanted) tables of contents, combating "information overload", reducing the need for scrolling, generally looking "cleaner", and in many cases allowing whole entries to fit on a single computer screen (thereby eliminating scrolling).
  • Perceived drawbacks to the change in interface included its sacrifice of horizontal screen space, its obfuscation of the comparison of homographic cognates, and its hindrance of serendipitous term-searches. (This third drawback is of especial significance to our Random entry tool on the occasion that it brings up an entry with multiple language sections (which isn't that often, but is a pretty big deal for entries for CJKV characters); however, the random entry tool is, IMO ATM, virtually useless.)
  • Three of those asked suggested moving the language tabs from the side of the page to its top in order to free up horizontal space, although reflective discussion amongst them concluded that that solution was also problematic (especially in the case of an entry with very many language sections); either way, the "double toolbar" presentation was deemed odd, though perhaps unavoidable.
  • One person complained that the dark-grey–on–light-grey presentation of the names of unselected languages offered too little text/background contrast, at times making the names difficult to make out. She suggested that all the language names be written black, with emboldenment being the only distinction in the presentation of the text of the selected language and the unselected languages.
  • Finally, one other person gave a fairly detailed description of the kind of interface she would prefer. In it, the language tabs would occupy more or less the same position that they currently do in the tabbed-languages interface. Differently, however, the language tabs would be the size, colour, and general presentation of the section names shown in the stacked interface's tables of contents; only the selected language's (or, potentially, languages') subsections would be shown — otherwise, they would be hidden, with only the language names displayed. To mitigate the loss of horizontal space, long language names (such as Old High German) could be allowed to flow onto two lines in the list of tabs / table of contents.

That's the feedback I got, for what it's worth. I hope it's of some use to this discussion. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 04:38, 21 December 2011 (UTC)

I think this is very good and thank you for doing this! —CodeCat 11:54, 21 December 2011 (UTC)
I'm glad you consider it worth while. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:41, 22 December 2011 (UTC)
"Three of those asked suggested moving the language tabs from the side of the page to its top in order to free up horizontal space, although reflective discussion amongst them concluded that that solution was also problematic." I am not sure if you are familiar with the Windows user interface, but it has an interesting solution to the problem: if there are more tabs than can fit in a row, the tabs are evenly distributed across multiple rows. This would be an excellent solution to the problem and shouldn't be too hard to implement. -- Liliana 13:03, 21 December 2011 (UTC)
It can be difficult because scripts may have no easy way to find out how wide each tab is supposed to be, because it depends on the font. —CodeCat 15:19, 21 December 2011 (UTC)
I don't know a lot, but can't you just make the tabs simple text with a border around them, so they can naturally wrap to the next line? That should be the easiest way. -- Liliana 15:35, 21 December 2011 (UTC)
The main enduring problem with moving the tabs to the top that they realised was that, the more language tabs there are, the harder it becomes to single out one's intended language quickly; presumably, multiple rows will only exacerbate the problem. Instituting variable language-tab widths is also problematic, since it would give languages with long names (such as Isthmus-Mecayapan Nahuatl and Wangaaybuwan-Ngiyambaa) far greater visual prominence than languages with short names (such as Ido and Lao). — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:41, 22 December 2011 (UTC)
I'd like to reiterate the criticism of how unselected language tabs are shown -- grey on grey is not very good visibility. Bold black text on a white background for the active tab sounds good, as does unbolded black text on a light grey background for unselected tabs. -- Eiríkr ÚtlendiTala við mig 16:56, 21 December 2011 (UTC)
I've changed the styles a bit so that the text color of the unselected tabs is a bit darker and the backgrounds are a bit lighter. (I made the borders a bit lighter, too.) --Yair rand 22:09, 21 December 2011 (UTC)
Also, I'd like to point out that at some point the WMF is probably going to roll out the Athena skin here, which does away with the sidebar altogether. That's probably still a long way away, though, so I'm not sure it makes sense to start planning for that now... Either way, we're usually much more pressed for vertical space than horizontal space. --Yair rand 22:18, 21 December 2011 (UTC)
Thanks, Yair. Would it be possible to make the text for unselected tabs even darker? It's dependent on viewing angle on my monitor, but the current text and background colors are still rather close together. -- Eiríkr ÚtlendiTala við mig 22:41, 21 December 2011 (UTC)
I've darkened the color a little bit more, from this to this, but there's only so far it can go without losing the "inactive" effect. --Yair rand 22:28, 26 December 2011 (UTC)

The mention of multiple tabs reminded me of an entertaining old UI article on Tabbed Dialogs. Michael Z. 2012-01-04 05:32 z

Tabbed languages for the horizontally challenged

If discussion on the new language tabs is happening elsewhere - please say where.
As I become used to this layout I like it! Could consideration be given to making the language names on the tabs smaller (esp. those not at the focus)? This would allow more space for the meat. But well done for those behind the change. —Saltmarshtalk-συζήτηση 10:24, 14 December 2011 (UTC)

I rather like this layout too, although it is most unfortunate that so much horizontal space is wasted. But there is already a Beer Parlour section where it is being discussed here. – Krun 10:50, 14 December 2011 (UTC)

Romanian vocative case?

I've noticed that the standard templates for declining Romanian nouns produce a table that includes nominative/accusative and genitive/dative, but omits the vocative case, which is present in Romanian. I understand the reasons for this may be that it is not used as often in proper or formal speech, as its use has a "directness" about it in addressing people that comes across as rude or unrefined if the speaker is not intimately acquainted with the person they're addressing. Nonetheless, it is still a legitimate part of the language's grammar, a remnant from Latin, and I was wondering if this should be incorporated, if just for completeness, academic curiosity or for use in more informal situations, as it will still be heard fairly often in conversations. (For example "om"- "omule!", "băiat"- "băiete!", "văr"- "vere!", "soră"- "soro!" and even proper nouns or names like "Ion"- "Ioane!") Maybe there should be some usage note to describe its use? Or should it just not be included at all? Word dewd543 19:04, 14 December 2011 (UTC)

It should definitely be included! As for it being a remnant of Latin, some of the forms you describe (particularly the feminine -o) seem rather to be the result of Slavic influence. Probably it's both, i.e. Slavic influence strengthened an existing phenomenon and added to it. Regardless, this should be included; we must be as complete as possible. – Krun 15:01, 15 December 2011 (UTC)
I agree. Of course I meant it was probably a highly modified remainder of that, with some forms clearly under heavy Slavic influence. But anyway, as I'm not acquainted with programming that into the templates, does anyone know how to do this? Also, since not every word has a vocative case that can be used, should it be an optional feature that can be added? Word dewd544 02:02, 16 December 2011 (UTC)

Reconstructed words

The words in reconstructed languages (e.g. Proto-Indo-European, Proto-Germanic etc.) do not meet the criteria for inclusion, so why does Wiktionary include these words? For example: http://en.wiktionary.org/wiki/Category:Proto-Indo-European_nouns Thanks for your response. Istafe 15:21, 15 December 2011 (UTC)

They exist only in the special Appendix: namespace. You could compare it to a printed dictionary, which would not have reconstructed words in the main body, but might have an appendix on the last 20 pages or so describing the reconstructions. -- Liliana 15:23, 15 December 2011 (UTC)
Yep, WT:CFI is for the main namespace. Mglovesfun (talk) 16:33, 15 December 2011 (UTC)

Why do we link to inflected forms?

There is something that occurred to me lately. Everywhere on Wiktionary we provide links to inflected forms, such as linking to goes in the headword line of go, linking to gaat in the conjugation table of gaan and so on. But is there really a point in that? Those pages are usually just form-of pages with no information except 'this is a certain inflected form of (headword)', and that's something the user would already know if they linked there from the headword's page. So I wonder why we link to those pages? Wouldn't it make more sense to just list them as plain words? —CodeCat 16:26, 17 December 2011 (UTC)

I sometimes find it very useful. It tells me at a glance what the form is, such as 3rd-person singular present indicative. Saves me from having to pore over a big table to pick out my form. —Stephen (Talk) 16:37, 17 December 2011 (UTC)
Inflected forms can have pronunciation information. --Vahag 17:13, 17 December 2011 (UTC)
In addition to pronunciation, inflected forms may also have (1) illustrative quotations specific to that form, (2) form-specific senses, much like plural-only nouns (this varies by language and part of speech). In any case, listing the forms without a link would make the user have to type in the entry name to get to the form page to find what's there, and I can't see that this would be an improvement over providing a link in the text. --EncycloPetey 19:37, 17 December 2011 (UTC)
Not to mention that participles inflect in many languages, requiring us to link to them. -- Liliana 19:39, 17 December 2011 (UTC)
I think, though, form-specific senses should be listed at the lemma form, anyway, or at least separately mentioned there with a separate link to the specific form (Say "(plural-only senses) See words."). Otherwise how is a user to guess that the sense even exists? —RuakhTALK 15:30, 18 December 2011 (UTC)
  • The answer is simple: we include all words in all languages. ---> Tooironic 09:22, 18 December 2011 (UTC)
  1. The links to the inflected forms from the lemmas are useful to contributors for determining whether the inflected forms have been entered in the appropriate language.
  2. I don't think that users come to inflected form pages very often from inflection tables. Rather they come from casually typing in the form that is on their mind, possibly directly from a passage they have just read.
  3. If you are asking about why we link to the inflected form rather than the lemma under most circumstances, I think the inflected-form links are often the result of laziness or lack of knowledge about pipes, especially by casual contributors. OTOH, I can think of instances where the inflected form may be desirable. For example, in etymologies an inflected form is sometimes useful because it shows a spelling that is closer to that shown by a successor in the derivation chain than the lemma does. But even there the stem could be shown and the link still be to the lemma.
As more experienced users (us) have ways such as popups of getting to underlying entries rather than inflected forms, it is tempting to ignore the fact that most users are usually more interested in definitions (therefore the lemma) than in inflected forms. (If there is evidence to the contrary on the prevailing interests of users, I would like to see it.) I think veteran users should think carefully about whether a given link is better to the lemma or an inflected form. Furthermore for links to English sections care should be taken to link to the best section (Etymology n or PoS). DCDuring TALK 15:19, 18 December 2011 (UTC)

More specific linking works for stable entries. However, it does entail one rub that I've run into in the past -- as entries are edited, subsections within a single language are much more likely to change than the language headings themselves. I can be reasonably certain that a given term in a language will remain under that language, and thus linking like [[word#Lang|word]] will most likely be a successful and stable long-term solution. However, I am much less certain that any given etymology (or sometimes even POS, such as the not-so-distant discussion about adjectives vs. adjectival nouns in Japanese) will remain where it is, so pointing a user specifically to the internals of a language's entry from some other page is much less tenable, and presents a much uglier maintenance issue -- if that entry is edited and the etymology and/or POS are altered or reordered, I have no automatic way of telling that I need to change a link on some other page, and that link may now either point to the wrong etyl/POS or point to just the term.

Some other similar systems assign unique identifiers to subsections that remain stable even if the page is reordered, so long as that specific subsection is not deleted. I don't think MediaWiki allows for this, unless editors add such uniquely identifying link targets to each page manually. -- Eiríkr ÚtlendiTala við mig 17:22, 19 December 2011 (UTC)
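The stability concern above can be sketched in Python: split a link into page and section anchor, and treat only anchors that name a language heading as reasonably safe long-term targets. The KNOWN_LANGUAGES set and both function names are hypothetical, chosen only for illustration; a real check would need the full list of language headers used on the wiki.

```python
# Hypothetical, deliberately tiny list of language headings.
KNOWN_LANGUAGES = {"English", "French", "Icelandic", "Japanese"}


def split_link(link):
    """Split '[[page#anchor|label]]' into (page, anchor)."""
    inner = link.strip()[2:-2]        # drop the surrounding [[ and ]]
    target = inner.partition("|")[0]  # ignore any display text
    page, _, anchor = target.partition("#")
    return page, anchor


def is_language_level(link):
    """True if the link points at a language section, not a subsection."""
    _, anchor = split_link(link)
    return anchor in KNOWN_LANGUAGES
```

Under these assumptions, [[word#French|word]] counts as a language-level link, while [[eavesdrop#Noun]] is flagged as pointing at a subsection that may move when the entry is reorganised.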

We should not list directly to a POS section...ever. As Eirikr has pointed out, POS headers do change, so the links can change. Further, there is no guarantee that a single language will continue to have a set number of POS headers; there is the possibility that one etymology for one noun will become two etymologies for two nouns, or that the reverse will happen and two sections will be merged. Even if the etymology remains stable, a new sense for a noun might pop up where one did not exist before, creating a problem with linking the POS. So, pointing to a language section is the only stable way to point (and even that can have issues, as with the whole BSC debate). --EncycloPetey 17:45, 19 December 2011 (UTC)

Open Call for 2012 Wikimedia Fellowship Applicants


I apologize that you are receiving this message in English. Please help translate it.

  • Do you want to help attract new contributors to Wikimedia projects?
  • Do you want to improve retention of our existing editors?
  • Do you want to strengthen our community by diversifying its base and increasing the overall number of excellent participants around the world?

The Wikimedia Foundation is seeking Community Fellows and project ideas for the Community Fellowship Program. A Fellowship is a temporary position at the Wikimedia Foundation in order to work on a specific project or set of projects. Submissions for 2012 are encouraged to focus on the theme of improving editor retention and increasing participation in Wikimedia projects. If interested, please submit a project idea or apply to be a fellow by January 15, 2012. Please visit https://meta.wikimedia.org/wiki/Wikimedia_Fellowships for more information.

Thanks!

--Siko Bouterse, Head of Community Fellowships, Wikimedia Foundation 12:56, 21 December 2011 (UTC)

Distributed via Global message delivery. (Wrong page? Fix here.)

The Polish Wiktionary has developed an interesting feature. All of the headings appear in your preferred language. If you don't know Polish very well, you can see all the headers in your usual language. See, for example, pl:społeczeństwo. It looks like each header calls a template, such as temp:etymologia. —Stephen (Talk) 16:19, 21 December 2011 (UTC)

That is pretty amazing, but it would require us to translate all the headers. Do we have the staff to do that? -- Liliana 16:31, 21 December 2011 (UTC)
The heading appears in Polish for me, how do I change it? —CodeCat 16:37, 21 December 2011 (UTC)
I think you need to change your language in the preferences. -- Liliana 16:43, 21 December 2011 (UTC)
I don't quite see the point. I mean, sure, I've looked at Wiktionaries in languages I don't know once in a while, but very seldom. I imagine that people who don't edit a Wiktionary look at unfamiliar languages' Wiktionaries even less often than I. They're really meant for people who know the language. I don't think we ought to offer our headings in languages other than English.​—msh210 (talk) 19:39, 21 December 2011 (UTC)
I do look at other language wiktionaries and I think the feature is worth considering, it's great, actually. I mean that Wiktionary should serve people from any background. Don't know if it involves a lot of work, though. --Anatoli (обсудить) 22:51, 21 December 2011 (UTC)
Let me describe for people wanting to experiment:
  1. Open pl:społeczeństwo. Have a look at headers.
  2. Click on Preferencje (Preferences).
  3. Change "Język interfejsu" from pl-Polski to en-English.
  4. Go back to pl:społeczeństwo, refresh the page and note that headers are now in English.
I don't know what it's using but the headers can be translated to any language listed. --Anatoli (обсудить) 23:04, 21 December 2011 (UTC)
I think it looks like a lot of work for very little benefit. When I look at pl:społeczeństwo and see headers in English, I still don't see anything particularly useful to me as someone who reads no Polish. I recognize the pronunciation because I know IPA when I see it - having the header in English doesn't help me there. Seeing "definitions" and "examples" in English doesn't help because I can't read them anyway. I don't need the English headers to recognize the inflections and translations - I'd know that's what they were even if the headers were still in Polish. In short, seeing the headers in English doesn't let me as a non-Polish speaker know anything more about the word społeczeństwo than seeing the headers in Polish would have. —Angr 23:12, 21 December 2011 (UTC)
It's easy to comment lightly on this if you stay in your comfort zone and use the English Wiktionary alone - your native language. When I look up words in Vietnamese or Thai Wiktionary I have trouble identifying parts of speech as I don't know all their names. If I look at Vietnamese vi:bó or Hindi hi:हिन्दी, I want, at the very least, to know whether it is a verb or a noun before I try to understand examples or grammar. It may be too much initial work, I have no idea, but the benefits are there. --Anatoli (обсудить) 02:30, 22 December 2011 (UTC)
I don't think it entails any translation work. I think it uses little MediaWiki files such as MediaWiki:Delete that are found on each Wikipedia and Wiktionary, so that the translations are actually provided by the Wiktionary of that particular language. That is, the English headers you see in the Polish Wiktionary are generated by us here in the English Wiktionary. —Stephen (Talk) 05:02, 22 December 2011 (UTC)
I thought so, thanks. I meant there could be work for programmers, editors, bot-writers, etc, if the idea were accepted. --Anatoli (обсудить) 05:54, 22 December 2011 (UTC)
  • I thought the problem with doing that is that it breaks page caching (not sure). Anyone know anything about that? --Yair rand 06:33, 22 December 2011 (UTC)

How would this affect section linking (i.e., links like eavesdrop#Noun)? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:14, 22 December 2011 (UTC)

I too fail to see the point; if you want something in English, use the English Wiktionary. Mglovesfun (talk) 10:47, 1 January 2012 (UTC)

I've gone through and internationalized all the Vietnamese Wiktionary's headers (other than the language headers, which are more complicated there) and various common inline templates into English and Spanish. I'll add more languages if time permits. Unlike the Polish Wiktionary's implementation, we're using the MediaWiki: namespace. Compare (vi) to "bó" (en) and (vi) to "人" (es). – Minh Nguyễn (talk, contribs) 12:28, 2 January 2012 (UTC)

You did a great job! This makes vi.wikt much easier for us to use now. Thanks. —Stephen (Talk) 12:43, 2 January 2012 (UTC)
You beat me to it, Stephen, I was gonna say the same. What a great feature! --Anatoli (обсудить) 13:13, 2 January 2012 (UTC)

Removing the naming distinction between different types of categories

I just came across Category:en:Onomatopoeia and I wondered why it wasn't an etymology category. But then I realised that it's not only an etymology category, because words that were coined as onomatopoeia might still be. And so now I wonder why we really distinguish different types of category so rigidly. We've had many votes and moves simply to move categories from one naming scheme to the other. And honestly I think it's getting a little silly. On the Dutch Wiktionary there is no distinction: there is nl:Categorie:Werkwoord in het Engels ("Verb in English") for English verbs and nl:Categorie:Natuurkunde in het Engels ("Physics in English") for physics-related terms. So I'd like to propose that we just use one single naming scheme for all language-specific categories, no matter what they are supposed to contain. —CodeCat 23:12, 21 December 2011 (UTC)

It's not exactly related, but Wiktionary:Votes/2011-10/Categories of names 3 failed not too long ago, one of the reasons being that many people felt the names were too long. -- Liliana 23:15, 21 December 2011 (UTC)
I'm aware of that, but that's really just a technicality. If we can first find a consensus on whether we need to change this, then we can maybe worry about how. —CodeCat 23:21, 21 December 2011 (UTC)
Part of the problem is that it's not really clear what a category is supposed to contain, either a priori (figure out what categories we want and why, figure out what they should contain in order to serve that purpose) or from the names (see what a category's name is, figure out what it should contain). Is Category:Verbs in English for English verbs (have), or English words relating to verbs (verbification), or both? Is Category:Physics in English for words used in physics (center of mass), or words used about physics (astrophysics), or both? I think we should start from the a priori side: what classes of pages is it useful to have categories for? (N.B.: when I say that it's "not really clear" I mean both that it's not clear to me, and that it doesn't seem to be clear to the community as a whole; there may be some people with specific opinions and clear ideas in their heads, but I don't think there's anything resembling clear consensus.)RuakhTALK 00:07, 22 December 2011 (UTC)
If all the category names were of the same type, we could settle possible problems about their content (or existence) separately. Now all such attempts lead into Kafkaesque arguments. I don't think the names of categories can ever accurately describe what's in them. I'd vote for short, simple names for all (en:Type for everything is my favorite). The content can be described on top of the category: "This category contains English verbs", etc. Newcomers are not likely to know at once what a category is, let alone bother about the difference between topics and parts of speech. When I came here, I didn't know or care about the difference between a category, appendix, or glossary. Dividing categories into two types with different names just makes it more complicated. --Makaokalani 17:02, 22 December 2011 (UTC)
This year's Christmas Competition is announced and is open to all contributors!
--EncycloPetey 20:38, 23 December 2011 (UTC)

language linking in translation sections

Ahhh, the old debate again...

But really, the current situation is clearly unsatisfactory. We link some languages and unlink others by totally arbitrary and subjective criteria, and as seen on Wiktionary talk:Translations/Wikification, this causes repeated debates and fierce fights about whether to link one language or not. That shouldn't happen in a serious dictionary like ours.

Really, we should decide on either linking all languages or unlinking all of them. Either is better than the current situation. Of course, both options have arguments for and against:

Unlinking all languages

  • Arguments in favor:
    • Anyone who wants to know about a particular language can always look it up in the search bar
    • It greatly simplifies the templates, thereby reducing server load
    • It reduces the potential for confusion, as a new reader may not know where to click if both the language name and the translation are bluelinked
  • Arguments against:
    • The argument above could also be used to unlink all words in definitions, and we're obviously not going to do that
    • At least in monobook and vector the search bar is a bit out of the way

Linking all languages

  • Arguments in favor:
    • Linking language names makes it easier to look them up, as it requires just a click
    • Even language terms like German may be useful to the casual reader
  • Arguments against:
    • It is still not entirely consistent, as language names cannot link to themselves (for example, the French translation of French)
    • If not all entries on language names exist, it creates a rather "colorful" mix, e.g. on water.
    • The bluelinks can be confusing if the language sense doesn't exist in the respective entry

-- Liliana 13:50, 26 December 2011 (UTC)

I prefer not linking the names, but if we do, it may be better to link to Wikipedia instead. If someone sees a name in a translation table, they will already know that it's a language, so our simple definitions won't help them much. Most likely they will be looking for more encyclopedic information, such as where the language is spoken and what family it belongs to. —This unsigned comment was added by CodeCat (talk • contribs) 14:24, 26 December 2011 (UTC).
Whilst I generally oppose language-name–linking, arguments 1 and 2 against it could be rendered void by (1) using {{l}} for all language-name links, and (2) creating the entries for the language names in question. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:40, 26 December 2011 (UTC)

the following comment was moved by Mglovesfun (talk) 13:27, 31 December 2011 (UTC) after being posted in a separate section below

A little background: {{en}} contains only English (aside from stuff wrapped up in noinclude) while {{ca}} contains {{{l|[}}}{{{l|[}}}Catalan{{{l|]]}}}. Effect: {{ca}} displays Catalan, while to remove the wikilink, one uses {{ca|l=}}. On the deletion debate WT:RFDO#Template:language, it has been called into question whether this additional feature adds enough value for it to be kept. The only real use is automatic linking whilst subst:ing, which is mainly used in translation tables. In the past I've personally used it for descendants as well. I believe (and I'd like confirmation from someone more advanced than me) that removing the wikilinks altogether should improve server performance (pages will load faster, things like that). Mglovesfun (talk) 10:40, 31 December 2011 (UTC)
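To make the {{ca}} mechanism concrete, here is a small Python sketch of how a parameter with a default, written {{{name|default}}}, behaves. It is a deliberately simplified model and not the real MediaWiki parser: the function `expand` and its regular expression are illustrative assumptions, and they only handle defaults that contain no braces.

```python
import re

# Matches {{{name|default}}} where the default contains no braces.
PARAM_RE = re.compile(r"\{\{\{(\w+)\|([^{}]*)\}\}\}")


def expand(template_body, args):
    """Replace each {{{name|default}}} with args[name] if supplied, else the default."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        # An explicitly empty argument (as in {{ca|l=}}) still counts as supplied.
        return args.get(name, default)
    return PARAM_RE.sub(substitute, template_body)


# Body of {{ca}} as quoted in the comment above.
CA_BODY = "{{{l|[}}}{{{l|[}}}Catalan{{{l|]]}}}"

print(expand(CA_BODY, {}))         # {{ca}}     -> [[Catalan]]
print(expand(CA_BODY, {"l": ""}))  # {{ca|l=}}  -> Catalan
```

With the parameter omitted, the three defaults supply the brackets and the result is a wikilink; with `l=` set to empty, every bracket default is replaced by nothing and only the bare name remains.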

If there's such a support for this, should it be subject to a vote? -- Liliana 23:36, 31 December 2011 (UTC)

Straw poll? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:03, 1 January 2012 (UTC)
If thou desirest... go ahead. -- Liliana 08:17, 1 January 2012 (UTC)
I don't want all languages unlinked, just remove the links from the templates themselves. I'm not proposing that a bot change every instance of [[Catalan]] to Catalan. Mglovesfun (talk) 13:32, 2 January 2012 (UTC)

I would suggest that the language names link to the category for that language, instead of the entry for the language name in English. I think that would be more relevant in these cases. In principle I would favour unlinking all languages, but in cases like water (good example!), where I haven't even heard about most of the languages, it is very nice to have a link to something that can tell me more about that language (and categories, as far as I can see, tend to have many relevant links to further reading about that language). Jon Harald Søby 23:42, 6 January 2012 (UTC)

Poll: language linking in translation sections

Still no straw poll? Then let me begin:

I want all language templates to be unlinked

  1. Support Liliana 00:34, 6 January 2012 (UTC)
  2. Support CodeCat 20:16, 6 January 2012 (UTC)
  3. Support --Vahag 22:05, 6 January 2012 (UTC)
  4. Support. But I don't oppose linking in translation sections, as one does not imply the other. Mglovesfun (talk) 20:17, 6 January 2012 (UTC)
  5. Support. (I would also be O.K. with linking all languages, but I think that linking none is preferable. But the linking-only-uncommon-languages approach has had bizarre consequences that simply are not tenable.) —RuakhTALK 00:56, 7 January 2012 (UTC)
  6. I weakly support, because of the server-load issue and the low utility of such links (i.e. who looks up a language name in a dictionary if he knows it's a language name? In an encyclopedia, maybe, but that's not what we currently link to).​—msh210 (talk) 23:58, 8 January 2012 (UTC)

I want all language templates to be linked

I want everything to stay as it is

  1. Support. Additionally, there are a number of languages with similar names that are regularly confused, such as Scots and Scots Gaelic. --EncycloPetey 00:08, 9 January 2012 (UTC)

I have another opinion not accounted for here

I don't care

  1. Support, though I guess this more or less constitutes an abstention. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 20:12, 6 January 2012 (UTC)
    I think it would be more convenient, for anyone trying to make sense of this straw poll, if you were to summarize your own position. (Unless you're actually in the "I don't care" camp. I had to make a guess, when I split "I don't care / I have another opinion not accounted for here" into separate sections.) —RuakhTALK 00:56, 7 January 2012 (UTC)
    Nope, "I don't care" is more appropriate, I think. :-)  — Raifʻhār Doremítzwr ~ (U · T · C) ~ 01:22, 7 January 2012 (UTC)
  2. Abstain. I refrain from taking a stance in this poll. --Dan Polansky 07:56, 14 January 2012 (UTC)

Result

So, what's the result? Should it be just done? -- Liliana 20:30, 20 January 2012 (UTC)

I think so... —CodeCat 20:31, 20 January 2012 (UTC)
No. An informal straw poll here doesn't override a vote.​—msh210 (talk) 03:50, 23 January 2012 (UTC)

Requests for verification AKA RFV, its template text

The text of the template {{rfv}} is IMHO incorrect, out of line with common practice at RFV. It reads as follows:

  • "If the information here cannot be verified or does not meet our inclusion criteria then it will be deleted. ..."

But RFV is not for verifying just any information (such as pronunciation and etymology), nor for just any failure to meet inclusion criteria (such as sum-of-partness aka compositionality).

This goes back to 2006, when the template text read as follows:

  • "It has been suggested that this entry does not meet Wiktionary's criteria for inclusion. ..."

I propose something like this:

  • "If this entry cannot be attested, by showing that it meets attestation criteria, it will be deleted."

The proposal refers to the entry rather than to "an entry or a sense", as it pertains to {{rfv}} rather than {{rfv-sense}}.

I would boldly edit the template, but I cannot, as only sysops can edit the template. --Dan Polansky 10:58, 30 December 2011 (UTC)

That makes sense; but probably it should say "this term" rather than "this entry"? —RuakhTALK 22:02, 30 December 2011 (UTC)
"This term" sounds good.
MG has edited the template so that it reads "If the definition or definitions here cannot be verified or does not meet our inclusion criteria then it will be deleted." This is still wrong, as not all inclusion criteria are at issue. I still propose what I have proposed above, using "this term" instead of "this entry":
  • "If this term cannot be attested, by showing that it meets attestation criteria, it will be deleted."
--Dan Polansky 15:56, 31 December 2011 (UTC)

first noun of a noun-noun compound is not (necessarily) an adjective

Forgive me if this is the wrong forum for this, I'm a casual wiktionary user only. I've noticed entries for "Adjective (not comparable)" for many words which are not adjectives, but which could easily be construed as adjectives, since they commonly occur as the first noun in a noun-noun compound. Some examples:

I know this distinction can be a bit subjective sometimes (especially for materials like acid/bamboo/etc), so before I go editing like mad, I wanted to know if there is a policy on this. —This comment was unsigned.

  • Nouns that are used to modify another one are still nouns. We call this attributive use. They should not (normally) be defined as adjectives. SemperBlotto 08:43, 2 January 2012 (UTC)
Can someone provide an example of a word which has been correctly marked as such? --72.14.228.129 17:38, 5 January 2012 (UTC)
What, a word marked as a noun, you mean? Mglovesfun (talk) 17:40, 5 January 2012 (UTC)
I have for some time advocated grammatical tests as the principal means to determine whether a word fell into a given PoS category. For adjectives, see WT:English adjectives. I think there is some agreement about this in the sense that a noun whose sole adjectival trait is that it is used attributively but whose entry has an Adjective PoS section usually gets that section removed when challenged at WT:RFD, which is our standard forum for handling such matters. One very significant proviso is that, if the term is used attributively with a meaning that does not clearly and directly correspond to a legitimate noun sense, then an Adjective section containing that sense should remain. An example, I think, of this proviso in operation would be acid#Adjective, for which at least the acid rock sense seems to me to be distinct from any noun sense that comes to mind. With the possible exception of a cappella, each of the others seems, at first blush, worth an RfD challenge to the Adjective section IMHO. DCDuring TALK 19:06, 5 January 2012 (UTC)

Middle Spanish

¶ Hullo. I wou'd like to creäte entries for Middle Spanish, but we do not possess the necessary categories or an index for Middle Spanish; there does not seem to be an ISO code for it according to its Wikipedia article, so an appendix may be necessary, unless it is possible to make our own code, as for Simple English. In essence: I desire to start entries for a particular language but we do not have the necessary resources right now, so I wou'd like to ask if somebody cou'd please provide them for us. I thank you. --Pilcrow 02:22, 3 January 2012 (UTC)

Considering it wasn't spoken so long ago, does it really merit separate treatment? —CodeCat 02:48, 3 January 2012 (UTC)
I'd create Middle Spanish entries under the Spanish header and tag them "obsolete" if they're distinct from Modern Spanish. The differences aren't as great as between Middle English and Modern English, especially not in the written language. —Angr 18:48, 4 January 2012 (UTC)

Norwegian Bokmål/Nynorsk

Why are there separate headers for Norwegian, Norwegian Bokmål, and Norwegian Nynorsk? I propose that these be merged under the common header 'Norwegian' and indicated as either Bokmål or Nynorsk when necessary. --JorisvS 16:15, 3 January 2012 (UTC)

They inflect differently though, so it isn't as easy as having two context tags Bokmal and Nynorsk. -- Liliana 06:40, 4 January 2012 (UTC)
Okay, but since when are we in the habit of having multiple headers for one language, even if context tags aren't sufficient to handle the differences? Note also that there exist not two, but three different headers for Norwegian. --JorisvS 11:22, 4 January 2012 (UTC)
I know almost nothing about the issue, but it is unresolved here; the templates {{nb}} and {{nn}} are often used under the Norwegian header (code {{no}}). Most of the Norwegian entries in User:Yair rand/uncategorized language sections/Not English aren't uncategorized, they just contain the 'wrong' language code. So some sort of real resolution would be nice. Mglovesfun (talk) 11:52, 4 January 2012 (UTC)
Yes, that's why I started this discussion. I think it's obvious that these should be under a common header and that this header should be ==Norwegian==. I'm not knowledgeable enough about Norwegian to have an opinion about the remaining issue(s) (inflection, as I understand it).--JorisvS 13:57, 4 January 2012 (UTC)
Arguably it should be the opposite way around, have separate headers for Bokmal and Nynorsk (thus eliminating the common Norwegian header). -- Liliana 16:28, 4 January 2012 (UTC)
Why? Why separate headers for what is essentially the same language? --JorisvS 20:07, 4 January 2012 (UTC)
The question of whether Bokmål and Nynorsk are different languages is certainly non-trivial. Given that they have separate Wikipedias and separate ISO-639-1 codes, I'd keep them separate unless native speakers argued otherwise.--Prosfilaes 09:36, 5 January 2012 (UTC)
They are not just different spelling forms but they also have different words in some cases, such as Bokmål dere and Nynorsk dykk. From what I understand, the two standards are based on different dialects, with Bokmål being based mostly on the urban dialects of Oslo and Nynorsk centered more around the west coastal area. Maybe we can look at how other Wiktionaries solve this problem. I know that Dutch Wiktionary treats Bokmål as 'Norwegian' and has Nynorsk as a separate language. —CodeCat 12:38, 5 January 2012 (UTC)
That's quite biased, though. -- Liliana 13:36, 5 January 2012 (UTC)
That's true, but in everyday practice most people who learn 'Norwegian' as a foreign language learn Bokmål, and never encounter Nynorsk at all. So we could either perpetuate this existing bias, or be correct at the cost of possibly confusing our users. —CodeCat 13:39, 5 January 2012 (UTC)
I support this bias as well. Unless it's Nynorsk, we are talking about Norwegian. Google Translate works with Bokmål but calls it Norwegian. If the words are spelled identically, mark them as Norwegian, otherwise add Nynorsk:
Translation of autumn into Norwegian:
 * Norwegian: {{t+|no|høst|m}}
 *: Nynorsk: {{t|nn|haust|m}}
In short, I support having two headers, Norwegian and Nynorsk, or merging them into Norwegian where practical. Bokmål should be merged into Norwegian and {{nb}} should not be used, only {{no}} and {{nn}} in some cases. --Anatoli (обсудить) 00:37, 6 January 2012 (UTC)
Are they really sufficiently different to hamper intelligibility? As I understand it these are different standard languages with separate language codes. We have another situation where there are different standard languages with separate language codes: Serbo-Croatian, whose standards were merged some time ago. --JorisvS 18:35, 5 January 2012 (UTC)
We just need to have n-nn and n-bo and, where the variety is not known, work on updating entries into one or the other, while not allowing any new entries that are n-unspecified. Norwegian is a special case, but how to treat it is not: it is universally treated as two languages on every operating system, translator, website, or Wikipedia I have ever seen. They just happen to be spoken very similarly, to the point of mutual intelligibility, even more so than Chinese dialects. Lucifer 18:52, 5 January 2012 (UTC)
Chinese dialects are actual dialects, though. Bokmål and Nynorsk are just different spelling systems, you can't really 'speak Nynorsk', even though some people still try. The spoken language and the written language aren't necessarily related. Many people speak Norwegian dialects, and might write in Bokmål even though Nynorsk more closely matches their dialect. And in the same way, urban people who speak a dialect that more resembles written Bokmål might still prefer to write in Nynorsk (although that's rare). —CodeCat 19:02, 5 January 2012 (UTC)

Please note that we're primarily a written dictionary, not a spoken one. Thus spelling differences are of much greater importance to us than pronunciations across dialects. -- Liliana 00:22, 6 January 2012 (UTC)

But why would we treat different spellings as separate languages? Would you support treating Pinyin as a separate language from Mandarin? Or Cyrillic Serbo-Croatian as separate from Latin Serbo-Croatian? The fact that both are standardised shouldn't matter either; the Valencian standard is distinct from standard Catalan, but we call both Catalan (although that's a debate in itself). Or what about Simplified Chinese and Traditional Chinese, which is actually very similar to the Bokmål-Nynorsk issue? In the end, what Wiktionary represents is a language. Bokmål and Nynorsk are not languages, they are different representations of one group of languages called Norwegian. —CodeCat 00:58, 6 January 2012 (UTC)
We merged Romanian and Moldavian, Serbo-Croatian varieties, so the same could be done with Norwegian and Albanian forms. --Anatoli (обсудить) 01:17, 6 January 2012 (UTC)
Maybe it also helps to look at how Norwegian Wiktionary itself treats Norwegian. They treat it as one language, but add qualifiers after words when necessary to specify whether the form is Bokmål, Nynorsk or both. This implies that Norwegian speakers themselves treat it as one language, not two. —CodeCat 01:22, 6 January 2012 (UTC)
True. The Chinese also treat Mandarin and Chinese as one language ({{zh}} links to Wiktionary, which is entirely in Mandarin {{cmn}}) but that's a different story. --Anatoli (обсудить) 01:41, 6 January 2012 (UTC)
When this topic was last discussed, almost a year ago, the consensus was to treat them as two languages: Norwegian Bokmål and Norwegian Nynorsk (Wiktionary:Beer parlour archive/2011/February#Norwegian headings). However, nobody was volunteering to sort out the existing entries. (Here is one example.) The user who most actively supported two headings at the time was Njardarlogar, who has made some 700 edits in the last year, primarily creating new entries for Norwegian Nynorsk. --LA2 23:38, 8 January 2012 (UTC)

Using one header is just confusing; it will lead to tags here, tags there, tags all over. Some words are more relevant in one language form than the other, and many words have no equivalents at all in the other language form. I cannot think of any Norwegian [language] dictionary that ever contained both Nynorsk and Bokmål; that would be pointlessly messy. I support the previous consensus, which landed on splitting the two language forms completely. This would leave the header Norwegian for dialectal words only. Regarding similarity, the same argument can be used for all the Scandinavian languages; they are very similar. Njardarlogar 10:09, 9 January 2012 (UTC)

AWB access

I would like to use AutoWikiBrowser to extract audio file names from the articles in this category. Can an admin please add me to the check page? I am an admin on English Wikipedia. I don't intend to make any changes, just browse the category. Thanks. Ganeshk 02:14, 5 January 2012 (UTC)

Why do you want to do that? Mglovesfun (talk) 13:46, 5 January 2012 (UTC)
For use with translation to the Tamil Wiktionary. Please see the request here. I would use the AWB access to extract the content into a CSV file and allow the Tamil Wiktionary folks to upload them. I plan to use custom modules as shown in examples here. Ganeshk 12:02, 6 January 2012 (UTC)
When you say 'extract', do you mean 'remove' or something else? Mglovesfun (talk) 15:25, 6 January 2012 (UTC)
I would parse each page in the category and regex scrape the audio file name and append it to a csv file on my computer. The page will then be skipped with no changes. Nothing will get removed from the page. Ganeshk 17:21, 6 January 2012 (UTC)
I see, OK; consider it done. Mglovesfun (talk) 17:24, 6 January 2012 (UTC)
Thanks! Ganeshk 00:42, 7 January 2012 (UTC)
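The read-only scrape Ganeshk describes above could look roughly like the following Python sketch. It is not an AWB custom module (those are written for AWB itself), and it assumes that audio files appear in the wikitext as the first parameter of an {{audio|...}} template. The page text, file name, and function names are illustrative only.

```python
import csv
import io
import re

# Captures the first parameter of {{audio|FILENAME|label}} (assumed template shape).
AUDIO_RE = re.compile(r"\{\{audio\|([^|}]+)")


def audio_files(wikitext):
    """Return every audio file name found in one page's wikitext."""
    return [name.strip() for name in AUDIO_RE.findall(wikitext)]


def to_csv(pages):
    """Turn {page title: wikitext} into CSV text with one (title, file) row per hit.

    The pages themselves are only read, never modified.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for title, text in pages.items():
        for name in audio_files(text):
            writer.writerow([title, name])
    return buffer.getvalue()


# Hypothetical sample page, standing in for one member of the category.
sample = {"anachronism": "* {{audio|en-us-anachronism.ogg|Audio (US)}}"}
print(to_csv(sample), end="")
```

Writing to an in-memory buffer keeps the sketch self-contained; a real run would write the same rows to a file on disk.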

Was the pronunciation that I added right? I am not sure. Lucifer 18:26, 5 January 2012 (UTC)

No idea, but I'd recommend Talk:anachronism for this sort of question, you can also use {{rfv-pronunciation}} which links to the talk page. Mglovesfun (talk) 19:45, 5 January 2012 (UTC)
It's not possible to say whether your pronunciatory transcription was correct without the intended accent being denoted (by {{a}}); however, if you intended to give an RP transcription, you were correct except for the secondary stress. I've tweaked the transcription per the OED [2ⁿᵈ ed., 1989]. BTW, as Martin notes, this isn't really the forum for this; at the very least, this is more appropriate to the Tea Room. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 21:05, 5 January 2012 (UTC)
The audio is accurate, yes. —Internoob 04:37, 7 January 2012 (UTC)

More languages to add?

I've been recently working to improve the coverage of the languages of Oceania on Wiktionary (which is generally pretty bad), and I realized that we have no words at all for a bunch of languages. These are living languages that should have around a few thousand native speakers each.

  • Wallisian {{wls}}
  • Anuta {{aud}}
  • Tikopia {{tkp}}
  • Rennellese {{mnv}}
  • Pukapukan {{pkp}}
  • West Futuna (often called Futuna-Aniwan, but Wiktionary already calls Futunan 'East Futunan') {{fut}}
  • Niuafo'ou {{num}}

Should I add them, and if so, how? Metaknowledge 20:59, 8 January 2012 (UTC)

Add them as you have been doing. What in more detail are you asking? How to look up ISO 639 codes? How to add languages that don't have ISO 639 codes? Mglovesfun (talk) 21:28, 8 January 2012 (UTC)
Sorry for any confusion. I want to know the format for making the Category:Language name page and for any necessary templates. Also, some languages have different codes in ISO 639-1, ISO 639-2, and ISO 639-3, and I want to know which to use. Metaknowledge 21:55, 8 January 2012 (UTC)
Added those codes I could figure out. -- Liliana 21:58, 8 January 2012 (UTC)
The ISO 639-1 code is used if there is one, otherwise ISO 639-3 is used, I think. (A quick way to determine a language's code is to type the name into the language field of the "Add translation box".) The format for Category:Language name is {{langcatboiler|language code}}. The countries in which the language is spoken can optionally be added as parameters two and on. --Yair rand 22:15, 8 January 2012 (UTC)
I've also added Pukapukan's code to the list. --JorisvS 16:25, 10 January 2012 (UTC)
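The rule of thumb Yair rand gives above (ISO 639-1 if one exists, otherwise ISO 639-3) and the category boilerplate can be sketched in Python as follows. The function names are invented for illustration, the rule itself was stated with an "I think", and the way country names are passed as extra parameters is an assumption based only on that comment.

```python
def pick_code(iso639_1, iso639_3):
    """Prefer the two-letter ISO 639-1 code; fall back to ISO 639-3."""
    return iso639_1 or iso639_3


def langcatboiler(code, *countries):
    """Build the wikitext for a Category:Language name page.

    Countries, if any, follow the language code as parameters two and on.
    """
    return "{{" + "|".join(["langcatboiler", code, *countries]) + "}}"


# Catalan has both kinds of code; Wallisian has only an ISO 639-3 code.
print(pick_code("ca", "cat"))   # ca
print(pick_code(None, "wls"))   # wls
print(langcatboiler(pick_code(None, "wls"), "Wallis and Futuna"))
```

ISO 639-2 is left out on purpose, since the comment above does not say where it would rank.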

Categories in need of cleanup

Many pages are in the request category, but need no longer be there.

http://en.wiktionary.org/wiki/Category:Requests_(English) —This unsigned comment was added by Dragonh4t (talk • contribs) 01:54, 10 January 2012.

As in how? Explain. -- Liliana 06:13, 10 January 2012 (UTC)
Many of the articles in the categories have definitions or etymologies.
People sometimes put them in those categories because they think that the definitions and etymologies are incomplete even if they're not missing entirely. —Internoob 23:51, 10 January 2012 (UTC)
Yeah, I know that, but what about words like deep or ad? The definitions seem fitting. Also, where can I go to learn more about editing? I know there are pages on it, but this is my first time using any web design type thing.

Category:Old South Arabian place names

It seems these are duplicates: Category:Old South Arabian Place names and Category:Old South Arabian place names (the P is different). And shouldn't it be Category:sem-srb:Place names? Would someone like to tidy up? --MaEr 18:57, 10 January 2012 (UTC)

Tabbed Languages trial is over

The admin-only Tabbed Languages trial has come to an end. For those who still want to use it, it is available opt-in in the Gadgets section of Special:Preferences.

So what's next? A vote on whether to enable it by default for all users? More testing? --Yair rand 22:21, 10 January 2012 (UTC)

I like it but I would like if it didn't switch to English automatically. Often when I'm working on a language, I would prefer to see that language each time and not have English pop up every time... —CodeCat 01:03, 11 January 2012 (UTC)
Okay, I've lowered the priority of English and Translingual, so that the "remembered" language takes priority over them, but English and Translingual are still higher up than targeted translations languages. Does anyone object? --Yair rand 23:29, 19 January 2012 (UTC)
I'm not sure what you mean with targeted translation? —CodeCat 23:45, 19 January 2012 (UTC)
I listed the old hierarchy at #Default tabbed non-English language. The new hierarchy places the "remembered" language two places higher. Targeted translations languages refers to the languages selected using the little "Select targeted languages" button at the top of translation tables. --Yair rand 23:32, 22 January 2012 (UTC)
Oh I see, thank you. —CodeCat 23:35, 22 January 2012 (UTC)
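As a rough illustration of the ordering discussed in this exchange, the Python sketch below picks a default tab using only the three tiers mentioned here: the "remembered" language, then English or Translingual, then targeted translation languages. The real gadget is a script running in the browser and its full hierarchy is listed elsewhere, so everything in this sketch, including the function name, the tie-breaking by page order, and the final fallback, is an assumption.

```python
def default_tab(sections, remembered=None, targeted=()):
    """Pick which language section to show first.

    sections:   language section names in page order
    remembered: the language the reader last viewed, if any
    targeted:   languages chosen via "Select targeted languages"
    """
    # Tier 1: the remembered language now outranks English and Translingual.
    if remembered in sections:
        return remembered
    # Tier 2: English or Translingual, whichever comes first on the page (assumed).
    for name in sections:
        if name in ("English", "Translingual"):
            return name
    # Tier 3: a targeted translation language, in the reader's order (assumed).
    for name in targeted:
        if name in sections:
            return name
    # Fallback (assumed): the first section on the page, or nothing at all.
    return sections[0] if sections else None


print(default_tab(["English", "Dutch"], remembered="Dutch"))  # Dutch
print(default_tab(["English", "Dutch"]))                      # English
```

Under this ordering someone working on Dutch entries keeps landing on the Dutch tab, which is the behaviour CodeCat asked for.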
There was a bunch of feedback (including specific suggestions) in various sections of this page and elsewhere. I wish I had time to collate it: perhaps someone can? —msh210 (talk) 02:27, 11 January 2012 (UTC)
List of suggestions (I probably missed some):
  • msh210 suggested that each language's content should start vertically positioned near the language name.
    I meant that specifically for when the language header is clicked to load the language's content, not when linked to from elsewhere. That way, the content is near what was clicked to get to it, and no scrolling is necessary. If linked to from elsewhere, then the content should be on top, as it is. —msh210 (talk) 17:35, 11 January 2012 (UTC)
  • DCDuring suggested that we should be able to select whether the Translingual or the English section merits priority placement, perhaps by placement of a template.
    • My own opinion on this is that English should just always be given higher priority, but a vote on that failed, so...
    • (Another point: The way the script is currently written, if there's both an English and a Translingual section on the page, and the English section is above the Translingual section, the English section is displayed at the start.)
  • Mzajac suggested making the standard page index (I assume this means the TOC?) float at the top-right, only showing sub-section links for the currently-selected language.
    • My personal opinion on this is that this would cause problems for our existing right-floated content, and probably wouldn't be worth it.
      • I've had the TOC floated top-right for years, and it doesn't cause any problems (it does reveal when some right-floated content is out of order, which I routinely fix). Floating the TOC/tabs on the right resolves the wasted space and misalignment issues with the tabs, and collapsing the TOC's non-displayed language sections would simplify the TOC for the reader, while retaining the section links for a very long entry. This would be a good combination of existing and new features, a less jarring design change, and make switching between the two schemes much smoother for new and experienced readers and editors. Michael Z. 2012-01-18 16:10 z
  • Doremítzwr suggested including a show all / hide all toggle atop the column of language tabs.
  • Saltmarsh suggested shrinking the language names on the tabs, especially those not at the focus, to allow more horizontal space for the substance of an entry.
  • And a suggestion from Codecat: "I think it would be a good idea to place the tabs horizontally in the place where the page name displays now. Since we use headword lines, we don't actually need the page name to be there anyway..."
I am hesitant to make substantial design changes to Tabbed Languages at this point, since the current design was made by an actual professional designer (WMF Senior Designer Brandon Harris, AKA User:Jorm) and I'm rather afraid that if we start fiddling with lots of things without a designer helping, the result will be very messy. I've asked on Jorm's talk page if he'd be able to participate in the discussion here about changes to the design. --Yair rand 04:05, 11 January 2012 (UTC)

Stale requests for cleanup

diegesis and other pages have request for cleanup links that when clicked on reveal there is no entry for that word in the RFCU page. How does this happen? Did the person who put the RFCU link in the word page forget to add an entry on the RFCU page? Or do the entries on the RFCU page age and disappear? Can I delete the RFCU tag on the word's page when this happens as there's no longer any way to tell what the requester originally wanted? -- dougher 04:07, 11 January 2012 (UTC)

Very many people add the tags without ever opening a discussion header on WT:RFC. In many cases it's obvious what needs to be fixed, in this one it wasn't (I fixed it anyway). Be sure to look at the history, the tag may just have been added years ago with nobody coming along to remove it once the page is fixed. -- Liliana 06:39, 11 January 2012 (UTC)
...and sometimes there was discussion at RFC and the discussion, never resolved, was archived/deleted anyway. Sometimes whatlinkshere (e.g.) will help find such conversations. Incidentally, it's RFC: RFCU is something else entirely.​—msh210 (talk) 17:40, 11 January 2012 (UTC)

Splitting the Beer Parlour

A while ago I suggested using subpages for different BP discussions, so that they could be more easily followed. That never went anywhere so I'd like to suggest something else instead. It's obvious that the BP is very busy and it's hard to follow discussions because many older discussions are missed out on when new ones are added. Splitting it into two or more distinct discussion pages would slow down the rate of posting somewhat and would make it easier to keep track of discussions, which would in turn allow for better participation. I don't know how it should be split, but since policy discussions are often relatively long, splitting them off into a separate page might be a good start. —CodeCat 19:53, 11 January 2012 (UTC)

Much better idea: Make the BP use Liquidthreads. -- Liliana 23:19, 11 January 2012 (UTC)
Um yeah, please don't! Equinox 22:00, 12 January 2012 (UTC)

"color/colour" etc.

e.g. at shade: "A postage stamp showing an obvious difference in colour/color to the original printing and needing a separate catalogue/catalog entry." I really hate this pandering to spelling pedants, which makes the definition look stupid and unprofessional. It doesn't fix the problem because they could still argue about which form comes first (before the slash). Isn't there any better way? Equinox 22:00, 12 January 2012 (UTC)

Well, we could just say that whatever was there first sticks and no one is allowed to change it (isn't this what we do now?). Or we could just pick one spelling and use it consistently. Or we could use something like {{#ifexpr:{{NUMBEROFARTICLES:R}} mod 2 = 1|color|colour}} to have it randomly alternate... --Yair rand 22:22, 12 January 2012 (UTC)
Would CURRENTTIMESTAMP be cheaper?​—msh210 (talk) 09:12, 13 January 2012 (UTC)
I have no idea. --Yair rand 00:11, 16 January 2012 (UTC)
We could also (which I think was proposed before) have some kind of user-level setting that specifies which set of spellings to prefer, but I doubt it's worth doing for so small a group of pedants, and it would also get complicated, as there are many more "Englishes" than just UK and US. Equinox 00:14, 16 January 2012 (UTC)
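For readers unfamiliar with parser functions, the #ifexpr suggestion above amounts to picking a spelling from the parity of some number that changes over time. The following is only a rough Python sketch of that logic, not wikitext; the function name pick_spelling and the sample counter values are made up for illustration. Note that the result alternates with the counter rather than being truly random.

```python
def pick_spelling(counter: int, odd: str = "color", even: str = "colour") -> str:
    """Return `odd` when the counter is odd, otherwise `even`.

    `counter` stands in for whatever changing number is used, for example
    an article count or a timestamp (both are assumptions in this sketch).
    """
    return odd if counter % 2 == 1 else even


# Hypothetical counter values, chosen only to show both branches.
print(pick_spelling(3_000_001))
print(pick_spelling(3_000_002))
```

A timestamp would work the same way as an article count here: only the parity of the number matters.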
Yes, it should just be one or the other. As "color" is used elsewhere in that same entry, it should be just "color" in those definitions too. I'm sure I've seen a guideline for it somewhere. Pengo 01:46, 13 January 2012 (UTC)
Something like a combination of what Yair and Pengo said sounds reasonable to me. Specifically: Keep whatever the first edition uses unless there's good reason to switch. Good reason to switch includes if you're adding more to the entry than it has already, and doing so in the opposite dialect. For example, we Hebrew editors can't decide on ch or kh as transliteration for a certain letter, so each of us does what he wants. But we leave an entry with its current transliteration scheme. But if an entry has one POS section and I add two more as big as it, I will without hesitation make the existing one use my transliteration scheme to make the whole entry consistent.​—msh210 (talk) 09:12, 13 January 2012 (UTC)
My opinion: 1. whichever spelling comes first should stay, 2. spellings should be consistent within an entry, and 3. a definition tagged as {{US}} or {{British}} should use the respective spellings in the definition. -- Liliana 09:47, 13 January 2012 (UTC)
2 and 3 can conflict.​—msh210 (talk) 19:12, 13 January 2012 (UTC)
I just always use US spellings, I think they're more internationally recognized and plus nobody can call me US biased, as I'm British, not American. Mglovesfun (talk) 10:25, 13 January 2012 (UTC)
I am just sitting on the fence. I have added this discussion to Wiktionary:American_or_British_Spelling, as I have found no other page where discussions of American and British spellings could suitably be listed. --Dan Polansky 12:35, 13 January 2012 (UTC)

If you think about it, it will become obvious that everyone should just use Canadian English, everywhere. Case closed. Michael Z. 2012-01-18 15:55 z

Narrower IPA, thinking about a vote

I think there should be a stricter policy on IPA. (Given that there isn't one which I'm plainly missing.) There was a vote on "using /ɹ/ at three words" and it passed. All its arguments can be applied to any other sound in any other word in any other language. Thinking from the viewpoint of a Wiktionary-user, not -member: "I want to know how to pronounce X. There's 'IPA: /X/'. What is IPA? [Opens Wikipedia, looks at the symbols. Comes to the conclusion that <X> is pronounced [X].]" Now, we know there are narrow and broad transcription and we know that when we click on the IPA in a Wiktionary entry, a key opens. But the usual user doesn't. I don't want to remove all broad transcriptions, but what I want to propose is this:
The Broad IPA transcriptions should use the IPA-sign closest to the sound without any combining marks. Take English. There are dialects which speak /r/, there are dialects which speak something like /ɻʷ/. But both RP and GenAm have /ɹ/. And some 95% (random number, not a statistic) of English accents speak a sound which is far closer to /ɹ/ than to /r/. So why write /r/ anywhere but in a narrow transc. for Northumbria? We mustn't make the IPA too broad, because in some languages we will end up merely copying the orthography, leaving the reader none the wiser. So no /r/ or /R/ for German, but /ʁ/. If you write /R/, add a|Austrian. No /sprɔːg/ for Danish but /sbʁɔːw/. And if it happens to be close to the narrow transcription, that is not a problem but merely a lucky coincidence. I always thought the purpose of IPA was not having to learn the whole phonology of a language. And with br. trans. such as /sprog/ I would simply end up pronouncing it utterly wrong. Dakhart 13:20, 13 January 2012 (UTC)

Umm sorry, but hasn't this been the consensus all along? I don't think you'll see /r/ used in any English or German entry here. -- Liliana 13:22, 13 January 2012 (UTC)
farm#Pronunciation, both Pron. and template. horse#Pronunciation, same. I already gave sprog#Danish as an example. So I strongly assume there's much more. Further, approx. every German entry I saw was Bavarian (e.g. Austrian). I just think it wouldn't harm to make it official and maybe add botting for it. Dakhart 13:29, 13 January 2012 (UTC)
I thought the policy here (at least de facto) was to use a broad transcription for English and a narrow one for other languages. That's why we put English transcriptions in slashes and other languages' transcriptions in square brackets. We can assume that users of the English Wiktionary have some knowledge of English and therefore know how the English r is pronounced. Using /ɹ/ would imply that precisely [ɹ] is the only possible realization of the English r phoneme, which it isn't; but using /r/ covers all existing realizations. It's long been the practice and policy of phoneticians, lexicographers and others using the IPA to use the typographically simplest symbols in broad transcriptions; that's why we use /iː/ rather than /ɪ̝j/ or something for the vowel of see. That's why every single English dictionary that uses IPA uses /r/ (Collins, COED, Longman Pronouncing Dictionary, Jones/Gimson, Kenyon & Knott, etc.) to render the English r sound, because they know that their readers are equipped with enough common sense to realize that /r/ stands for "the English r sound (however it may happen to be realized in the accent you're most familiar with)" in the context of an English-language dictionary and not necessarily for "voiced alveolar trill". —Angr 14:16, 13 January 2012 (UTC)
My opinion is that /ɹ/ should be encouraged, but not forced; using /ɹ/ instead of /r/ will give us no disadvantage, but since most sources use /r/ we can't consider it wrong. It would also be nice using /ɫ/ instead of /l/ in words like peel, and placing /ʰ/s where they exist. Ungoliant MMDCCLXIV 14:27, 13 January 2012 (UTC)
Only if we switch to using square brackets instead of slashes, and only if people then add an extra line to accommodate dialects where peel doesn't have [ɫ] (like Irish English); and then to be fair an extra line would also have to be added to words like leaf to show the dialects where it does have [ɫ] (like Scottish English and Australian English). Making our English transcriptions narrow seems to be an awful lot of work for zero benefit. —Angr 14:41, 13 January 2012 (UTC)
My opinion is that our policy should be compatible with verifiability. If the normal practice among linguists and lexicographers and so on is to write /a e i o u/ in discussing a certain language, then we should write /a e i o u/ even if phonetic realizations vary greatly depending on environment, because otherwise we're basically requiring original research: we won't even be able to take pronunciations from reliable sources. —RuakhTALK 14:43, 13 January 2012 (UTC)
1. I didn't say narrow, I said narrower. 2. What I was going to post (edit conflict): Well, according to this we don't use /r/, since the original intention of this vote clearly was the same as the one voiced by me now. Why he changed it, using example words instead, I do not know. I have seen some broad transcriptions in brackets on Wiktionary. There are proper narrow transcriptions for other languages, but there are also things like [d] for [̪d̪]. Further: Dictionaries give an explanation of their script used. Wiktionary does too, but as said: Only when a user happens to find it. On the other hand: "Using /ɹ/ would imply that precisely [ɹ] is the only possible realization of the English r phoneme" seems to be very strange a sentence to me since I always thought that using [ɹ] would imply that precisely [ɹ] is the only possible realisation of English <r>.
Most transcriptions for other languages (that I saw, naturally) are in slashes and I think that vowels are not a problem in them. /i:/ does depict the standard pron.s of "see" good enough, because "ee" is, at least for some part, a rather unrounded rather close front-vowel. But /zi:/ wouldn't. Because "S" is not a rather voiced consonant. And in the same vein <r> is not rather trilled. And the Danish <G> in "sprog" is neither velar nor a plosive in any way. The only advantage of such very broad transc. I can see is that they are more convenient for the author, but they bear more risk to mislead. And last but not least: I'm talking general policies, not English alone. To rephrase my proposal: "Let's use broad transcriptions for all IPA-entries, no matter what language, but use the IPA-sign that is closest to the nature of the actual sound used in the Standard given." That is: A velar sign for a velar sound, a trill sign for a trill sound, a /d/ for any sort of voiced-tongue-based stop etc. But not a trill for an approximant, not a velar stop for a labial approximant. And I think we won't have a problem finding a source that says that neither GenAm nor RP use a trilled R or a source saying that Dutch G is a /ɣ/ rather than a /g/. Which it isn't. I think no dialect has it but all dialects have /xχʝç/. Yet, /ɣ/ is broad enough but certainly narrower than /g/. Dakhart 14:52, 13 January 2012 (UTC)
I use /r/ because it's the most commonly used here and also the easiest to type. I will continue to do so until there is a consensus or a succeeded vote to do otherwise. Mglovesfun (talk) 17:23, 13 January 2012 (UTC)
There has been (for English).​—msh210 (talk) 19:21, 13 January 2012 (UTC)
To paraphrase my comment on the Tea Room, the vote doesn't really say what the voters are voting on. It only affects "words like red, green and orange". I have genuinely no idea what that's supposed to mean. Mglovesfun (talk) 19:36, 13 January 2012 (UTC)
No, it affects "the r phoneme in words like red, green and orange" (emphasis mine). Those three words exemplify English /r/. (I imagine they were chosen so as to give a diversity of phones; in GenAm, at least, red is typically pronounced with a retroflex /r/, green with a bunched /r/, and orange with a rhoticized vowel, though there is variation in all three. The point being that all three words supposedly have the same phoneme, just realized differently.) —RuakhTALK 21:22, 13 January 2012 (UTC)

And the underlying phoneme is /ɹ/, not /r/. To repeat/rephrase myself: IPA should give a broad transcription (slashes) unless somebody is really sure about a standard pronunciation, which then is given in narrow transcription (brackets). This should be because of how easy it is to make a wrong/nonstandard narrow transcription. And the broad transcription should give the IPA-sign for the phoneme occurring in most positions without combining signs. The underlying phonemes could easily be gathered from Wikipedia, which has sufficient sources for most languages. Such phones would be /ɹ/ for all <r>s, p.e. [ɹʷ], /l/ for [lˠ] (English), /ʁ/ for [ɐ̯] (Danish, German), /ɣ/ for /xcj.../ (Dutch), /g/ for [j] (Swedish), /d/ for [d̪ ð] (Spanish) and so forth. I gather that, while some would vote nay, nobody sees a reason not to vote on it. So I will wait two days for further input and then find out how to get the vote rolling. Dakhart 21:47, 13 January 2012 (UTC)

/g/ for [j] in Swedish would be a bad idea because there is a phonemic merger with /j/, it's more than just allophony. —CodeCat 22:01, 13 January 2012 (UTC)
I tripped upon that one too. But details are for later. The important thing is that no sign is used which represents a phone not existing within the language. Dakhart 22:31, 13 January 2012 (UTC)
Sorry, but the claim "the underlying phoneme is /ɹ/, not /r/" makes no sense. The underlying phoneme is simply a rhotic consonant; English only has one, so nothing else about it needs to be specified underlyingly. We write it /r/ instead of [+rhotic] (or [+sonorant, −nasal, −lateral] or whatever) because it's easier for humans to read. We write it /r/ instead of /ɹ/ for the same reason. Wiktionary already uses narrow transcriptions for languages other than English, so if you find a misleading broad transcription for Danish, just change it. It's a wiki. You don't have to discuss it or bring anything up for a vote to do that. If there's a vote on anything, it can only be about English, because English is the only language that would be changed by such a vote. —Angr 23:12, 13 January 2012 (UTC)

Non-lemma forms on rhymes page

Taking Rhymes:English:-ɪŋɪŋ as a typical example, there is a line that says <!--Do not add present participles or gerunds to this page unless they have other meanings-->. Um, why ever not? WT:Rhymes doesn't mention it, Wiktionary:ELE#Rhymes also does not mention it, am I right in thinking this isn't a consensus, but just one or more editors who wrote the invisible comments many years ago, and are therefore no longer relevant unless there is some evidence that this is still a consensus. Mglovesfun (talk) 17:02, 13 January 2012 (UTC)

  • Well, I don't think we could stop poets from using participles, gerunds &c as rhymes, so we should be able to include them in these pages if we want. Some of the pages could become ginormous mind you! SemperBlotto 17:07, 13 January 2012 (UTC)
    Rhymes:French:-e for one! Mglovesfun (talk) 17:21, 13 January 2012 (UTC)
    If we consider traditional rules for rhymes, this page should include only et, ohé, Noé, Pasiphaé, Aglaé, béer..., gréer..., agréer..., and a few others, but not words where there is a consonant sound before /e/. blé and thé are not considered as rhymes in French. Lmaltier 08:12, 14 January 2012 (UTC)
  • It might be hard to pull out lemma forms from the list if others are there, too; OTOH, I can't think of a good reason one might want to do so, so I'm with you unless someone comes up with one.​—msh210 (talk) 19:20, 13 January 2012 (UTC)
  • Do we waste more valuable resources (eg, contributor time, download time) in trying to enforce such limits or in having long lists of trivial Rhymes? DCDuring TALK 19:44, 13 January 2012 (UTC)
    It would be far better to auto-generate the rhyme lists based on pronunciations. A word would then be "added" to the rhymes page by simply giving it the correct pronunciation in IPA (or whatever other notation). Equinox 22:56, 15 January 2012 (UTC)
    That'd be difficult without the StringFunctions extension (which, seemingly, we're not getting) unless we change our IPA template to do something like {{IPA|lang=foo|nɑnˌɹɑjmɪŋg̚p|ɑɹt}}.​—msh210 (talk) 16:02, 17 January 2012 (UTC)
    Actually, we will be able to have templates manipulate strings as soon as we get Lua scripting available, but I'm not sure it would be a good idea to merge rhyme content and pronunciation content, since many users might not actually know how to use IPA, but do know that one word rhymes with another word and can thus be added using the "Add new rhyme" forms. --Yair rand 20:48, 15 February 2012 (UTC)
  • On a related note, forbidding non-lemma forms on Czech rhymes pages makes no sense to me, as, in Czech, it is the particular inflected form that has to rhyme. --Dan Polansky 20:42, 13 January 2012 (UTC)
    Yup, also for Icelandic. What with vowel changes and all manner of irregular forms (which occur to a lesser extent in English as well), non-lemma forms need to be listed as well. This is what I've always done for the Icelandic rhymes. – Krun 22:10, 13 January 2012 (UTC)
As far as I know, the restriction against non-lemmata was instituted at the start of the Rhymes project, and has never been discussed as far as its value. I think that, given the current state of things on Wiktionary, inclusion of non-lemmata should be allowed and comments forbidding their inclusion be removed. --EncycloPetey 02:21, 17 January 2012 (UTC)
I agree. --Yair rand 20:48, 15 February 2012 (UTC)
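As an illustration of the suggestion above to auto-generate rhyme lists from pronunciations, here is a minimal Python sketch. It assumes each word has an IPA string with a primary stress mark, and it takes the rhyme to be everything from the vowel of the last stressed syllable onward. The names VOWELS, rhyme_key and build_rhyme_pages, the small vowel set, and the sample transcriptions are all assumptions made for the example; nothing here reflects how the IPA template or any Wiktionary tool actually works.

```python
from collections import defaultdict

# Assumption: a small, non-exhaustive set of IPA vowel symbols.
VOWELS = set("aeiouyɑæɐɒɔəɛɜɪʊʌø")


def rhyme_key(ipa: str) -> str:
    """Return the rhyme part of a pronunciation, prefixed with '-'.

    Takes the text after the last primary stress mark (or the whole
    string if there is none) and drops leading consonants.
    """
    tail = ipa.rsplit("ˈ", 1)[-1]
    for i, ch in enumerate(tail):
        if ch in VOWELS:
            return "-" + tail[i:]
    return "-" + tail  # no vowel found; keep the tail unchanged


def build_rhyme_pages(pronunciations: dict) -> dict:
    """Group words by rhyme key, with each group sorted alphabetically."""
    pages = defaultdict(list)
    for word, ipa in pronunciations.items():
        pages[rhyme_key(ipa)].append(word)
    return {key: sorted(words) for key, words in pages.items()}


# Hypothetical sample data, transcribed broadly for the example.
SAMPLE = {
    "ringing": "ˈɹɪŋɪŋ",
    "stringing": "ˈstɹɪŋɪŋ",
    "pinging": "ˈpɪŋɪŋ",
    "berry": "ˈbɛɹi",
    "very": "ˈvɛɹi",
}

pages = build_rhyme_pages(SAMPLE)
for key, words in pages.items():
    print(key, words)
```

A sketch like this also shows the concern raised in the thread: a word only lands on a rhyme page if someone has supplied a correct IPA transcription for it.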

Like the vote says. Someone may want to put something in to reflect what Yair rand says about different spacing when different dialects are involved. There are still 7 days to edit the vote. Mglovesfun (talk) 17:20, 13 January 2012 (UTC)

Why considering the number of syllables for rhymes?

This seems to be quite irrelevant. What would be most helpful is an order of rhymes according to the richness of the rhymes, i.e. in a kind of reverse phonetical order (giving priority to vowels): e.g. ringing should be near stringing because of the common ringing, making them closer to each other than pinging. Lmaltier 08:25, 14 January 2012 (UTC)

If you're writing a poem, the number of syllables could be rather important — unless I'm missing something. Equinox 22:55, 15 January 2012 (UTC)
Of course, but the number of syllables of the verse, not of the last word of the verse. I now understand that this can be useful when the verse is almost complete and you try to find the last word. But in most cases, you look for a rhyme much before that, and the richness of the rhyme is something important. Lmaltier 17:34, 17 January 2012 (UTC)
This may depend on language.​—msh210 (talk) 17:45, 17 January 2012 (UTC)
"Richness" wouldn't be useful for English rhymes; in English, "ringing", "pinging", and "stringing" all rhyme to the same extent. —RuakhTALK 17:44, 17 January 2012 (UTC)
You are right: this depends on languages (see w:Rhyme). My suggestion does not apply to English, but it applies to French (and probably to some other languages). Lmaltier 18:30, 17 January 2012 (UTC)
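The "reverse phonetical order" proposed at the top of this section can be sketched as sorting pronunciations by their reversed phoneme strings, so that words sharing a longer ending sit next to each other. The Python below is only an illustration under simplifying assumptions: each character is treated as one phoneme, the extra priority for vowels is ignored, and the function names and transcriptions are invented for the example.

```python
def shared_suffix_len(a: str, b: str) -> int:
    """Count how many trailing symbols two pronunciations have in common."""
    count = 0
    for x, y in zip(reversed(a), reversed(b)):
        if x != y:
            break
        count += 1
    return count


def reverse_phonetic_order(pronunciations: dict) -> list:
    """Sort words by their pronunciation read from the last symbol backwards."""
    return sorted(pronunciations, key=lambda word: pronunciations[word][::-1])


# Hypothetical broad transcriptions without stress marks.
WORDS = {
    "stringing": "stɹɪŋɪŋ",
    "pinging": "pɪŋɪŋ",
    "ringing": "ɹɪŋɪŋ",
}

print(reverse_phonetic_order(WORDS))
print(shared_suffix_len(WORDS["ringing"], WORDS["stringing"]))
print(shared_suffix_len(WORDS["ringing"], WORDS["pinging"]))
```

With this ordering, ringing and stringing end up adjacent because they share a longer ending with each other than either does with pinging.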

Announcing Wikipedia 1.19 beta

Wikimedia Foundation is getting ready to push out 1.19 to all the WMF-hosted wikis. As we finish wrapping up our code review, you can test the new version right now on beta.wmflabs.org. For more information, please read the release notes or the start of the final announcement.

The following are the areas that you will probably be most interested in:

  • Faster loading of javascript files makes dependency tracking more important.
  • New common*.css files usable by skins instead of having to copy piles of generic styles from MonoBook or Vector's css.
  • The default user signature now contains a talk link in addition to the user link.
  • Searching blocked usernames in block log is now clearer.
  • Better timezone recognition in user preferences.
  • Improved diff readability for colorblind people.
  • The interwiki links table can now be accessed also when the interwiki cache is used (used in the API and the Interwiki extension).
  • More gender support (for instance in logs and user lists).
  • Language converter improved, e.g. it now works depending on the page content language.
  • Time and number-formatting magic words also now depend on the page content language.
  • Bidirectional support further improved after 1.18.

Report any problems on the labs beta wiki and we'll work to address them before the software is released to the production wikis.

Note that this cluster does have SUL but it is not integrated with SUL in production, so you'll need to create another account. You should avoid using the same password as you use here. — Global message delivery 00:06, 15 January 2012 (UTC)

Wikipedia blackout

For those who weren't already aware, the English and German Wikipedias will be "blacked out" tomorrow (the 18th) in protest of impending US legislation. The Main page will be replaced with a blackout banner, and editing will be locked for the duration of the protest. See WP:SOPA for more information. Commons may be displaying a banner, but does not appear to be planning to lock down. --EncycloPetey 02:17, 17 January 2012 (UTC)

At the Dutch wiktionary a banner is flying in solidarity and there are discussions elsewhere. Will the English Wiktionary consider the same? Jcwf 04:10, 17 January 2012 (UTC)
It's a bit late now to gain any meaningful consensus for it.​—msh210 (talk) 15:57, 17 January 2012 (UTC)
I wouldn't have thought so. SemperBlotto 08:41, 17 January 2012 (UTC)
I predict that we will get 999 angry comments saying "I HATE U WIKIPEDIA, U STOPPED ME DOING MY HOMEWORK". Equinox 23:54, 17 January 2012 (UTC)
Yeah, I predict that despite the blackout being Wikipedia-only, we (Wiktionary) will get at least a few such angry comments. Because, you know, people won't be able to leave them on Wikipedia during the blackout. Phol 01:27, 18 January 2012 (UTC)
Or more probably because people genuinely can't tell Wikipedia and Wiktionary apart. Look at the pathetic specimens we get on the feedback page. Equinox 01:34, 18 January 2012 (UTC)
And for those who haven't realized, a WP blocks out a (very) short time after loading, so stopping the page's loading will allow it to be displayed.​—msh210 (talk) 16:14, 18 January 2012 (UTC)
Yes, my internet connection is so slow that it took me a while to realise that the Java script was supposed to be blocking pages. If you really want to read Wikipedia, just disable Java in your browser. Dbfirs 17:10, 18 January 2012 (UTC)
No, just disable Javascript. Javascript and Java are completely different things. --Yair rand 21:53, 18 January 2012 (UTC)
Sorry, yes, my mistake! Dbfirs 23:00, 18 January 2012 (UTC)
Or simply right click->View page source :). JamesjiaoTC 22:19, 18 January 2012 (UTC)
m:English Wikipedia SOPA blackout/Technical FAQ#Are there ways to circumvent the read blackout? The page lists several.​—msh210 (talk) 22:23, 18 January 2012 (UTC)
Adding ?banner=none or &banner=none to the end of the address works too. —CodeCat 22:24, 18 January 2012 (UTC)
Or just pressing the browser's "stop" button before the page finished loading... --Yair rand 22:25, 18 January 2012 (UTC)
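The choice between ?banner=none and &banner=none mentioned above depends on whether the address already has a query string. A small Python sketch of that rule, using only the standard library, follows; the function name with_banner_none and the example addresses are assumptions made for illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_banner_none(url: str) -> str:
    """Return the URL with banner=none added to its query string."""
    parts = urlsplit(url)
    # Keep existing parameters, dropping any earlier banner value.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != "banner"]
    query.append(("banner", "none"))
    return urlunsplit(parts._replace(query=urlencode(query)))


# Example addresses, one without and one with an existing query string.
print(with_banner_none("https://en.wikipedia.org/wiki/Rhyme"))
print(with_banner_none("https://en.wikipedia.org/w/index.php?title=Rhyme"))
```

Letting the library rebuild the query string avoids having to decide by hand which separator applies.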

The 'definition' of non-English place names

Our current practice for non-English place names is to give them a definition in English, and to create a link to the English entry in the non-English entry, with the proper translation into English. This is our practice for regular words as well so it's not really that strange. But with place names it often seems backwards. In many cases, the English 'translation' is the same word, as it was simply loaned from the place of origin into English. For example, Catalan Girona is simply defined as 'Girona', with a link to the English section, even though the city is in Catalonia. And the same way for Dutch Eindhoven, Indonesian Jakarta and so on. I'm not quite sure what would be a better way to display this, but it seems strange to me that the main definition is in the English section when the name is clearly native to another language. —CodeCat 18:56, 17 January 2012 (UTC)

So you think the English definition should be "English name of Jakarta" or "English name of ירושלים"? That sounds reasonable, but IMO the following four reasons for doing it the way we've been doing it win out: (1) Consistency with non-proper-noun entries. (2) The lack of desire to get into a fight over which name should be chosen as the primary one, linked to in all the definitions, when more than one language-speaking group lays claim to a place. (3) The primacy of English-language entries: they shouldn't rely on other-language entries for their definitions. (4) Readability: an English-language definition should not include foreign-language words.—msh210 (talk) 20:03, 17 January 2012 (UTC)
I agree with Msh210. Let's keep to simple principles. But the discussion was not about English entries, and I understand CodeCat's concern. I think that, in such cases, the definition in the non-English sections could be written as: [[Jakarta#English|Jakarta]] (the capital city of Indonesia). Lmaltier 20:48, 17 January 2012 (UTC)
Sure, {{gloss}} is always good to use.​—msh210 (talk) 22:38, 17 January 2012 (UTC)
I think the status quo is the best practice. In addition to Msh210's arguments above, doing it this way also allows consistency with entries for place names where the native name is spelled differently from the English name, so München#German is defined as Munich, and Praha#Czech is defined as Prague, while the meaningful definitions are at the English names. Using {{gloss}} is only necessary if the English entry has more than one meaning, and the native entry corresponds to only one of those meanings. Thus, if at Prague we have "1. The capital city of the Czech Republic" and "2. A town in Lincoln County, Oklahoma", then Praha#Czech should say "(the capital city of the Czech Republic)" so readers know that the town in Oklahoma is not also called Praha in Czech. —Angr 23:17, 17 January 2012 (UTC)
New senses can be added at any moment. It's not always necessary to add a gloss in the non-English word definition, but if you want to add it (just in case), it's never bad, as it might become necessary some day. And, even when unnecessary, it might help some readers. This is true for all words, of course, not only placenames. Lmaltier 18:20, 18 January 2012 (UTC)
FWIW, I agree with both of you, Angr and Lmaltier: {{gloss}} is necessary only when there's more than one definition but sometimes helps (and never hurts) even otherwise.​—msh210 (talk) 22:28, 18 January 2012 (UTC)

Radio shorthand and other codes, is it translingual?

In radio communication, there are many shorthands such as SOS (emergency), CQ (calling all stations), 73 (best regards), as well as the Q codes such as QSL (reception report). These are used internationally, and as far as I've been able to tell they're used in other languages as well as English. But as English has had a leading role in international radio communications, I'm not quite sure whether these terms are translingual or not. What category would be best for such terms, given that they are a kind of 'translingual radio slang'? —CodeCat 18:05, 18 January 2012 (UTC)

Well I see them used a lot in German running text, so it's safe to assume they're translingual. -- Liliana 19:48, 18 January 2012 (UTC)
I think they are translingual, but this fact does not exclude additional sections for several languages (with pronunciation, examples showing how it is used in the language, etc.), even if these sections seem much less useful for these codes than for other translingual terms (scientific names in biology, etc.) Lmaltier 20:12, 18 January 2012 (UTC)

Irony and sarcasm

Currently, {{ironic}} redirects to {{sarcastic}}. I submit that this should be the other way around. 'Sarcastic' is far too restrictive a word for how virtually all the terms in Category:English sarcastic terms are used. Ƿidsiþ 17:37, 21 January 2012 (UTC)

In school I was told that sarcasm is a type of irony, so I agree. Ungoliant MMDCCLXIV 18:20, 21 January 2012 (UTC)
Sarcasm is often used to mean "verbal irony", but that's often considered a misuse. The OED defines sarcasm as "A sharp, bitter, or cutting expression or remark; a bitter gibe or taunt. Now usually in generalized sense: Sarcastic language; sarcastic meaning or purpose" and irony (in the relevant sense) as "A figure of speech in which the intended meaning is the opposite of that expressed by the words used; usually taking the form of sarcasm or ridicule in which laudatory expressions are used to imply condemnation or contempt." Properly speaking, neither is a subset of the other; something like "Good going; wanna break anything else while you're at it?" is both, but something like "You suck at this" is only sarcasm (not irony), and "Nice weather, huh? I love trudging through knee-deep snowdrifts" is only irony (not sarcasm). Some of the terms in Category:English sarcastic terms do not seem ironic to me, only sarcastic; what's ironic about no duh? —RuakhTALK 22:50, 23 January 2012 (UTC)

Renaming requests for verification

I am in the process of creating Wiktionary:Votes/2012-01/Renaming requests for verification, which proposes to rename WT:Requests for verification to WT:Requests for attestation. Feel free to discuss the proposal here or on the vote's talk page, as you see fit. Feel free to postpone the vote should the discussion last longer than until the start of the vote.

Most recent relating discussion: Wiktionary:Requests_for_moves,_mergers_and_splits#Wiktionary:Requests for verification to Wiktionary:Requests for attestation, March 2011. --Dan Polansky 13:48, 22 January 2012 (UTC)

Responding to one of the arguments made in the previous discussion: 'Whatever we call the page, we will need to explain it to new users/contributors. "Verification" is 20 times more common in English than "attestation". [...] Consequently, Oppose. DCDuring TALK 00:30, 28 March 2011 (UTC)': "verification" is misleading, so its being common does not save it. The term "attestation" is used by CFI, and it is "attestation" as defined by CFI that is being sought at the page currently called "WT:Requests for verification". --Dan Polansky 13:54, 22 January 2012 (UTC)
I strongly prefer Wiktionary:Please read the prologue of this page to see what it's all about. It's so far the only proposed name that makes it clear what is going on in there. -- Liliana 05:31, 23 January 2012 (UTC)
This seems to be made in joke, or as a sarcastic argument. For the latter case: the jocularly proposed page name does not tell the user at all what the page is about. Actually, all pages in Wiktionary namespace could have this name. The name with "attestation" is not significantly longer than "verification", so the implication in that jocular argument that the renaming is going to make page names needlessly long is wrong. Another way of reading this sarcastic remark is as saying this: page names in Wiktionary namespace don't matter, as everyone can read the top of the page anyway. By contrast, I find clear and fitting page names a good thing, regardless of the option to read the top of the page. Curiously, the top of the page has to say that 'Requests for verification is a page for requests for attestation of a term or a sense, [...]'. When a newbie sees this sentence, the natural response would often be like "if this page is for requests for attestation, why the heck is it called requests for verification"? --Dan Polansky 07:56, 23 January 2012 (UTC)
Or "I don't know what attestation is, but from the page title, I guess it just means 'verification'." This easier-to-understand name is a poor choice, because it's actually just easier to misunderstand. Michael Z. 2012-01-30 22:01 z

Please help with sorting out unknown language names

Sometimes people request translations and such for languages that we don't have a code for on Wiktionary. I've modified {{ttbc}} and {{trreq}} temporarily to add any language names it doesn't recognise to Category:CodeCat's test category. Could everyone please help empty that category again, by replacing the parameter of those templates with the proper code? Thank you! —CodeCat 21:20, 22 January 2012 (UTC)

I've fixed one of them, and its problem was that it used {{ttbc|[[languagename]]}}. Whoever fixes others, can you state whether that was the problem also? If so, perhaps we should adjust {{ttbc}} to allow for such use.​—msh210 (talk) 03:59, 23 January 2012 (UTC)
That one was [[pander]]. Same thing at [[illness]].​—msh210 (talk) 18:36, 23 January 2012 (UTC)
At [[safety]], the problem seems to be that the entry contains {{ttbc|Visaya}}, and we don't have Visaya as a language (in fact, it seems not to be one). But it is a language family, and that seems like an appropriate use of {{ttbc}}. Perhaps the template should allow for such use (by language-family code if not by name)?​—msh210 (talk) 18:27, 23 January 2012 (UTC)
Similar issue at [[bone]]: it uses {{ttbc|Old Mongolian}} and {{ttbc|Middle Turkish}}, and we have neither language. Again, I didn't remove these, as I don't know them to be nonexistent: maybe we just need to add the languages. (See also w:Middle Turkic languages and w:Middle Mongolian language.)​—msh210 (talk) 18:36, 23 January 2012 (UTC)
As Wikipedia says, there's no language "Old Mongolian", as the first written sources appeared only in 12th century. We have {{xng}} and {{cmg}} though. Not sure what to do about Middle Turkish. -- Liliana 17:11, 24 January 2012 (UTC)
I've brought the number down to three. I'm not sure what to do with the remainder though. The problems with bone have already been mentioned, and sinew also mentions 'Middle Turkish'. octillion uses 'Chinese numeral' as a language, I'm not sure what that's supposed to be. —CodeCat 21:11, 23 January 2012 (UTC)
Mglovesfun has fixed octillion. I've removed the Old Mongolian from bone because it didn't seem to be correct or in a correct script; as long as I was at it, I removed the Middle Turkish (which it was oddly subordinated to), too. That leaves sinew. - -sche (discuss) 03:42, 30 January 2012 (UTC)

Internet =/= Internet slang

Last time I checked, these contexts worked like this:

However, a lot of words in Category:en:Internet are Internet slang instead. (epic fail, a/s/l, BTW...) I can recategorize them, but I'd like to make the distinction clear first.

(Standard disclaimer: But feel free to propose different things.)

Hi, Wiktionary.

--Daniel 09:27, 24 January 2012 (UTC)

Things like IP and hyperlink aren't necessarily Internet related; they occur in a network as well. Those should be {{networking}}. -- Liliana 16:56, 24 January 2012 (UTC)
Are you saying the 'Internet Protocol' is not just for the Internet? —CodeCat 17:00, 24 January 2012 (UTC)
How do you expect a modern network to function without IPs? NetBEUI and IPX/SPX are obsolete nowadays. -- Liliana 17:59, 24 January 2012 (UTC)
That's right: one can set up an IP network which is not connected to the Internet. —AugPi (t) 18:21, 24 January 2012 (UTC)
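The point that IP does not require the Internet can be illustrated with a small sketch. This is only a minimal example, assuming a machine with a working loopback interface; the function name `loopback_echo` is hypothetical and not taken from the discussion. It opens a TCP/IP connection between two threads of one process over 127.0.0.1, so no traffic ever leaves the machine.

```python
import socket
import threading


def loopback_echo(message: bytes) -> bytes:
    """Send `message` to a TCP server on 127.0.0.1 and return its echo.

    Hypothetical helper for illustration: both ends of the connection
    live in this one process, so no external network is involved.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]

    def serve_once() -> None:
        # Accept a single connection and echo back what arrives.
        conn, _ = server.accept()
        with conn:
            conn.sendall(conn.recv(1024))

    worker = threading.Thread(target=serve_once)
    worker.start()

    # The client side speaks ordinary TCP/IP, just to a loopback address.
    with socket.create_connection(("127.0.0.1", port)) as client:
        client.sendall(message)
        reply = client.recv(1024)

    worker.join()
    server.close()
    return reply


if __name__ == "__main__":
    print(loopback_echo(b"hello over IP, no Internet needed"))
```

The same code would work unchanged between two hosts on an isolated local network by replacing the loopback address with a private one.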
  • This would seem to be a problem in the way context information is used to populate topical categories. Topical categories and usage contexts overlap, but neither is a subset of the other. Perhaps the remedy is either to not use contexts to populate categories or to allow individual contexts to be marked in such a way as to override the default categorization. The general answer would seem to be that topical categorization should be distinct from usage contexts, a point MZajac made years ago. DCDuring TALK 19:33, 24 January 2012 (UTC)
    • I agree. It would be nice if there were a separate {{topic}} template. But it would also mean that we would have to make a distinction between {{topic|Internet}} and {{context|Internet}}, because they can't both use {{Internet}} as the underlying template... —CodeCat 19:39, 24 January 2012 (UTC)
      We could allow the context to have priority, especially as there is much less subjectivity and arbitrariness and more linguistic content to usage contexts. Topical categories have always seemed much more arbitrary to me. And, as we would not in general have sense marking for topical categories if we make the context-topic distinction, it would not be clear which sense accounted for the headword being in the category. DCDuring TALK 19:58, 24 January 2012 (UTC)
    • Re "Topical categories and usage contexts overlap, but neither is a subset of the other. Perhaps the remedy is either to not use contexts to populate categories...": We've already decided on that remedy. Alas, it's not yet implemented as widely as it should be. —msh210 (talk) 06:04, 25 January 2012 (UTC)
  • If we're going to use (networking) instead of (Internet) as the context of IP because technically there are instances of IPs existing without Internet... We may as well use (hypertext) instead of (Internet) as the context of web page, hyperlink, splash page, pop-under, frameset, because technically we can view these things in offline hypertext pages. --Daniel 08:35, 25 January 2012 (UTC)
    • Internet Protocol (IP) not only can be used outside of the Internet, it frequently is: for example, it's used even for communicating between processes on a single machine, for local area networks, and increasingly with peripheral devices. I'm not sure if you're just trying to make a point about pedantry, but regardless it's a fair point that "offline" hypertext pages exist, so I'll address it in good faith. Some of those words could make sense offline or within a broader context, for example "hyperlink" and "frameset" could be considered "networking" or "computing" terms, rather than Internet-specific. But "web page", "splash page", and "pop-under" all imply Internet. A pop-under, for example, makes little sense offline, even if it's technically possible, and "web" in "web page" is for "world wide web", part of the Internet. TL;DR: I strongly suggest IP be considered "networking" rather than "Internet" (it's not just being pedantic, it's how it's commonly used), and if you want to broaden the scope of some "Internet" terms to be "networking" or "computing" that seems fair enough to me but they should be considered on a case-by-case basis. Pengo 12:19, 27 January 2012 (UTC)
    • I think Daniel makes a good point, and I mostly disagree with you, Pengo. A frameset has nothing to do with networking (the connection of multiple computers); it is only part of a hypertext document; it just so happens that we see most of our hypertext on Web pages that come over a network, but they don't have to, and sometimes don't — so "Internet" (relevant context) is a more reasonable tag for frameset than "networking" (irrelevant context). Likewise, a pop-under can certainly exist offline and make sense, e.g. when developers are testing their sites. Equinox 23:53, 27 January 2012 (UTC)
  • You seem to "mostly disagree" with only two examples (and one of them due to a misunderstanding). My overall point was that the context labels should be considered on a case-by-case basis and that IP is definitely networking and not Internet, and you don't seem to be disagreeing with that. Sorry, I stated the frameset example ambiguously. I meant it could be considered "computing" (and that "hyperlink" could be considered "networking" or "computing"). As for pop-under, testing a pop-under offline is still testing it for the Internet. Like I said, a pop-under makes little sense outside of the context of the Internet, even if one could technically exist offline, so I'd consider it extremely pedantic to broaden its context. You can disagree if you like, I'm not really so worried about how it ends up or if it has the context/topic removed. Pengo 00:38, 28 January 2012 (UTC)
Thanks for the permission! Equinox 00:48, 28 January 2012 (UTC)
No, no, no! Don't apply labels based on facts about the referent! If they contribute to the definition, then they belong in the definition. Don't label something internet or computing based on whether the thing works online or offline. You don't label the definition of bear with (woods). Nor should you label each sense just to help the reader discriminate each item in a long entry. This confusion is why "context" is such a poor name for these labels.
A usage label is applied only based on by whom and where the term is used.
Everybody knows what a web address is – don't label it. Internet Protocol is a technical term in computing and networking, but anyone who operates a web browser or other networked software might benefit from knowing what an IP address is: I'd be tempted to label it with the more general computing. Hyperlink predates the WWW, and is a concept in various media, including writing, multimedia CD-ROMS and computer software interfaces; we now find hyperlinks in all of our apps and ebooks. I don't think it is technical or restricted enough to warrant a label, or at most computing. Image map seems to occur in books on web design and graphics, but not in web users' how-to books: label it web design or web authoring. I see that splash page appears in books about web authoring and marketing, so perhaps label it with both. —This unsigned comment was added by Mzajac (talkcontribs).

Let's not overuse these lexicographical restricted-usage labels. Web page, for example, is not jargon or restricted to specialized lexical contexts, and shouldn't be labelled as such.

For "topical" categorization (although I can't understand why we would try to duplicate Wikipedia in categorizing the referents of terms), what is wrong with typing [[category:Internet]] at the end of a definition line? Michael Z. 2012-01-27 15:29 z

There is a lot of overuse of the context labels to clean up -- and a need for the advocates of topical categories to actually hard-code topical categories. And the default use of the contexts to include entries in topical categories should end, as plenty of time has passed to allow for the hard-coding to categories for those entries with misused context labels. Appropriate context labels are a useful guide for the insertion of hard categories using AWB or some fully automated approach. DCDuring TALK 18:49, 27 January 2012 (UTC)
We should refurbish the nomenclature, which is vague and encourages misuse. Our "context" has no useful meaning, and should be replaced with restricted-usage labels, or usage labels for short. "Topical context labels" are not for identifying the topical context of a sense – they're restricted-usage labels for technical or specialized terms – perhaps these should be called technical or subject usage labels. {{context}} can be renamed {{usage}}, which is practically unused, or {{label}}. "Grammatical context" labels have nothing to do with context, and should be regarded separately as grammatical labels.
See category:Context labels. Michael Z. 2012-01-27 23:44 z

Dinosaurs.

If anyone is interested, I have copied over a list of dinosaur names from Wikipedia, containing over 1,300 names - all blue links at 'pedia, but mostly red links here. The list is at User:BD2412/walk the dinosaurs, though I won't object if others want to move it to project space or otherwise rename it. I don't see myself getting back to this for a while, but please have at it. Cheers! bd2412 T 19:31, 26 January 2012 (UTC)

Wow this is incredibly useful, thanks for creating this valuable page, I'll try to find some time to look it over. -- Cirt (talk) 23:59, 13 February 2012 (UTC)

When to use the gerund tag

I just discovered Appendix:Glossary#gerund. The languages I work on (gml, de, nds) genuinely treat gerunds as nouns (compare Leben, lęvend). Would the right thing to do be to add the gerund tag in front of those nouns? Dakhart 14:32, 28 January 2012 (UTC)

That depends on the language. English treats gerunds as nouns, but we only list them as nouns when the term has taken on strongly noun-like characteristics that warrant a separate definition. Otherwise, we simply label English gerunds as "Verb" since they are also a present participle form. However, for Latin gerunds we have a separate "Gerund" part of speech, since Latin gerunds do not behave fully like nouns. Among other differences, they have no nominative and no plural, and have a modified conjugation table as a result. Consequently, Latin gerunds are not treated in the same way as English gerunds. What you do depends on the languages you're looking at. I don't know enough about gerunds in German to offer any more specific advice. --EncycloPetey 16:12, 28 January 2012 (UTC)
In Italian, we use "Verb" as the section name, and use {{gerund of}} (with "lang=it") in the definition line. SemperBlotto 16:20, 28 January 2012 (UTC)

7 Wonders

Much in the way that we have kept from listing specific people by first and last name, I would propose that we not include the place name for specific entities that otherwise warrant inclusion unless the place name is integral to the name of the entity. The Seven Wonders of the Ancient World will be used to illustrate this idea, assuming that we might all consider these to be permissible dictionary entries under some title. I would permit:

In some cases the full name is required:

Can we agree to allow these entries under the suggested titles? DAVilla 03:11, 30 January 2012 (UTC)

  1. Oppose Liliana 03:17, 30 January 2012 (UTC)
  2. Oppose also, don't include them. WT:NOT#Wiktionary is not Wikipedia. Mglovesfun (talk) 11:40, 30 January 2012 (UTC)
    This is my feeling too. Equinox 17:52, 30 January 2012 (UTC)
    To clarify, if a term has no linguistic merit, don't include it because it is well known, or whatever. Mglovesfun (talk) 16:33, 31 January 2012 (UTC)
    I never said they should be included because they're well known. Rather, I had assumed that they all have linguistic merit. Worse, I assumed you all realized this, but having been challenged, there's no reason to think this would not still have to be proven. Yet your reflexive denial of their linguistic merit is a pathetic stubbornness that seeks to separate encyclopedic terms from language constructs despite the myriad of such names that have been individually scrutinized and passed and the myriad of encyclopedic titles that are nonetheless English words. In a more hypothetical construction than the concrete case I've laid out, your denial of the antecedent would not stand. But far be it from me to argue with an exclusionist about the addition of language that would aid your cause rather than include any terms beyond these seven, which I promise you cannot remain red indefinitely for the force of evidence in their favor. DAVilla 17:19, 5 February 2012 (UTC)
  3. Oppose, but not for Mg's reason. We're not an encyclopedia, so we shouldn't be discussing which referents we should include words for but, rather, which words we should include. That is, Statue of Zeus and Statue of Zeus at Olympia are two different words (if you will) and each gets included, or not, on its own merits. There's no cause at all to say "we should include one of them, so let's decide which title is better": that's the purview of an encyclopedia. (Plus, I suspect none of these should be included at all, as Mg alludes to, but that's another issue and not my point here.)​—msh210 (talk) 17:14, 30 January 2012 (UTC)
  4. Oppose per Mglovesfun and msh210 (and maybe Liliana as well). —RuakhTALK 17:50, 30 January 2012 (UTC)
  5. Please, no. DCDuring TALK 18:15, 30 January 2012 (UTC)
  6. Oppose  hanging (lemma: hang) and gardens (lemma: garden) are dictionary terms: lexical units with inherent meaning. hanging gardens is merely a sum-of-parts phrase, deriving meaning from its component terms, and I hope we can all agree it doesn't belong in the dictionary. Capitalizing it Hanging Gardens signals it as a name or title (denoting a Toronto restaurant, among many other things) but again, this is not a lexical unit with unique meaning, and doesn't belong in the dictionary. Ditto for Hanging Gardens of Babylon, but because it is widely used to refer to one particularly famous thing, many editors will argue to keep it. Encyclopedic entries like this just duplicate Wikipedia, very poorly. I say delete them all, or redirect them to Wikipedia, and concentrate on being the best possible dictionary. Michael Z. 2012-01-30 21:55 z

I have started working on this category. About a quarter or more of the requests are for non-English citations. (See non-Roman character entries, eg here, but also various Esperanto entries.) Do we not need to have subcategories for this by language, at least for languages other than English?

Category membership comes almost entirely from {{rfdate}} and templates like {{quote-book}} with the "year" parameter omitted. It is a simple matter to add lang= to rfdate, though it does not now categorize by language. Should we not do this and also add a lang= categorization capability for templates like {{quote-book}}? DCDuring TALK 14:32, 31 January 2012 (UTC)

A user has been adding entries for Old Javanese. I've been removing them purely because there's no code for it, I know just about nothing about Javanese, but we do for example have Category:Old Swedish language with the ad hoc code {{gmq-osw}}. Should Old Javanese be permitted a code? NB I would interpret a lack of objections as 'go ahead'. Mglovesfun (talk) 16:32, 31 January 2012 (UTC)

Old Javanese does have a code: {{kaw}}. -- Liliana 16:38, 31 January 2012 (UTC)
It displays Kawi, do we want to change it to Old Javanese? Mglovesfun (talk) 17:08, 31 January 2012 (UTC)
I think Old Javanese would be better for consistency with other languages. -- Liliana 17:09, 31 January 2012 (UTC)
I've edited {{kaw}} and restored the two Old Javanese entries I removed. Mglovesfun (talk) 11:40, 1 February 2012 (UTC)
In Java, we didn't call them "Old Javanese" (Indonesian: bahasa Jawa Kuno), because that would imply something different; instead the name "Kawi" (Indonesian: bahasa Kawi) is more appropriate. Bennylin 11:09, 6 February 2012 (UTC)

Names of languages in their own language (several questions)

We have a French entry for français, an Italian entry for italiano etc, and I have just added a Javanese entry for Basa Jawa. I think that we really ought to have an entry for every language in its own language.

Is there an easy way of finding out which ones are missing?

Shouldn't they all be simple nouns (uncountable), not proper nouns?

Should they all be uncapitalised (if written in an alphabet)? SemperBlotto 17:06, 31 January 2012 (UTC)

Probably no to all of the last three; English would be an exception to both #3 and #4. Mglovesfun (talk) 17:29, 31 January 2012 (UTC)
Appendix:ISO 639-1? The terms are unlinked, but it should not be hard to link them all. -- Liliana 17:30, 31 January 2012 (UTC)
Done, though some of them might be SOP. —RuakhTALK 17:53, 31 January 2012 (UTC)
And capitalization seems to be incorrect in some cases: the page lists Italiano.​—msh210 (talk) 23:48, 31 January 2012 (UTC)
Capitalization is a function of the rules of whatever language the word is occurring in. The German word for the German language - Deutsch - is properly capitalized, as are all language names in German. bd2412 T 17:52, 31 January 2012 (UTC)
[3] and [4] have local names for many languages, but capitalisation is an issue. Ungoliant MMDCCLXIV 21:14, 31 January 2012 (UTC)
No, these will not all be nouns. Language names in some languages, like Latin and Slovene, are usually adverbs or adjectives. --EncycloPetey 02:21, 2 February 2012 (UTC)

Which form of a letter is lemmatised: the majuscule or the minuscule?

By which I mean, if I define K, k, do I put the information that concerns both forms of the letter at K or at k? And does it matter in which language is the letter that I'm treating? I ask because, for the members of Category:la:Letter names of the Roman alphabet, for example , should it be defined as "The name of the letter k." (as it is currently) or as "The name of the letter K."? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 23:11, 31 January 2012 (UTC)

For Latin itself it should probably be the capitals, because that's all the Romans used. And I think for the sake of convenience, as well as common practice, it should be the same for other languages too. —CodeCat 23:35, 31 January 2012 (UTC)
Agreed. I've modified the entries for the fourteen members of Category:la:Letter names of the Roman alphabet accordingly. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 23:47, 31 January 2012 (UTC)
It contradicts our policy of using lowercase, though. As well, there are many more languages which use lowercase only than ones who use uppercase only. -- Liliana 11:28, 1 February 2012 (UTC)
Could you provide a link to that policy, please? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 13:55, 1 February 2012 (UTC)
If you say that there is no such policy, feel free to move free to Free and dictionary to Dictionary, in that case! -- Liliana 15:55, 1 February 2012 (UTC)
I think the policy doesn't concern single letters, though, any more than it concerns acronyms... —CodeCat 16:02, 1 February 2012 (UTC)
Where's the difference between words and individual letters? (It applies to acronyms too, but those are *usually* written in all uppercase, so they're okay) -- Liliana 16:10, 1 February 2012 (UTC)
The difference is in English usage. We use caps to give letters their own identity, whether standing alone or strung together, while in lowercase they are subsumed into words. E.g., Nasa is a word (nah-saw), but in ISO the letters remain letters (aye ess oh). Michael Z. 2012-02-01 18:11 z
That principle clearly doesn't apply to individual letters — Free and Dictionary are red-linked as standard, the majuscule forms of letters are never red-linked as standard. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:04, 1 February 2012 (UTC)
As for Latin. The Romans used capital letters when hammering them into stone, but used a lowercase script when writing with a stylus. And anyway, the Latin language outlived the ancient Romans. SemperBlotto 16:12, 1 February 2012 (UTC)
Not to mention Latin is still official in Vatican City. -- Liliana 16:17, 1 February 2012 (UTC)

In English, minuscule is the default case used in running text, while capitalization is used for letter emphasis. However, the majuscule is the basic historical and stereotypical form of each letter, the first form of learners, the one used in indexes, and the most common one used for letters in isolation, and in abbreviations where the letters stand for themselves. It seems sensible to lemmatize the majuscule. Michael Z. 2012-02-01 16:16 z

Like Ruakh, I'd prefer to lemmatise both. For example:
  1. A: "majuscule form of a, the first letter of the basic modern Roman alphabet" or "the first letter of the basic modern Roman alphabet (minuscule form: a)"
  2. a: "minuscule form of A, the first letter of the basic modern Roman alphabet" or "the first letter of the basic modern Roman alphabet (majuscule form: A)"
I expect that at a minimum, if we lemmatise only one, e.g. A, we must include a definition line in a "minuscule form of A".
I (would/do) similarly oppose having some sense lines at e.g. a British spelling like colour but not at color because Americans don't use the word in those ways: it may be true that only one spelling has the sense, but it's confusing. Let usage notes and context and qualifier tags clarify that certain senses are generally used in one place or another, and thus in one spelling or another. Both A and a are the first letter of the alphabet, in addition to A being an ampere and a being a year, so I'd like the letter-ness mentioned in both places, A and a. - -sche (discuss) 20:15, 1 February 2012 (UTC)
I agree with your suggestions, but I think it's inaccurate to say that "only one spelling has the sense." The term has senses, and spellings, and some of them are used mainly in certain places, times, situations, or media. For example, in Canadian English (a branch of "American English," historically), the term is mainly spelled colour, but also color, and it may share senses with either or both British and US usage. This is why we should lemmatize the term, and not any spellings or capitalizations.
It's incorrect and misleading to treat colo(u)r as two different words. We lemmatize spellings and capitalizations just because MediaWiki software lets us. We need a better guideline to help us define and lemmatize terms as lexical units. Michael Z. 2012-02-03 16:36 z

Which form of a letter is lemmatised: the majuscule or the minuscule? — Straw poll!

Scope: The Roman, Greek, and Cyrillic alphabets.

I support lemmatising the majuscule forms of letters
  1. Support — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:33, 1 February 2012 (UTC)
    There's a problem to lemmatising minuscules in some cases — in the case of the Greek sigma, there is only one majuscule form, viz. Σ, whilst there are two minuscule forms, viz. σ and ς; which of those forms should be lemmatised, if we decide to lemmatise letters' minuscule forms? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 18:21, 1 February 2012 (UTC)
    In that case, clearly σ. As a general rule — well, majuscules might have the same problem. —RuakhTALK 18:43, 1 February 2012 (UTC)
    Why "clearly"? For me, lemmatising ς seems the intuitive choice, by analogy with choosing s over ſ. Also, which majuscules, if any, have the same problem? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:27, 1 February 2012 (UTC)
    I think I disbelieve your claim that you'd rather lemmatize [[s]] than [[ſ]]. ;-)   —RuakhTALK 23:41, 1 February 2012 (UTC)
    All kidding aside, I think that (if we lemmatise minuscules) it would make more sense to lemmatise the terminal forms, rather than the medial forms. The terminal form is the form the letter would take in isolation, because the medial form is only used when it is followed by other letters in the same word. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 23:49, 1 February 2012 (UTC)
    No, as I understand it, σ is the form used in isolation, with ς only being used at the end of a word. (And one reason that Unicode gives them separate code-points is that they can't be distinguished algorithmically, because there are abbreviations that end with σ, but I don't know if that's the exception or the rule.) By the way, in English, even when ſ was in use, I think that s was the default form, though now that the question is raised I suppose I'm not sure of that. —RuakhTALK 00:24, 2 February 2012 (UTC)
    OK. Well, if you're right that "σ is the form used in isolation", then that is the form that we ought to lemmatise, if we were to decide to lemmatise letters' minuscule forms. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:26, 2 February 2012 (UTC)
    There's one problem. What do you do with digraphs, that also have a titlecase form? Would you use the uppercase (DZ), or the titlecase (Dz)? -- Liliana 15:17, 3 February 2012 (UTC)
    That depends; what's the form that's used in isolation, the uppercase or the titlecase form? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:12, 3 February 2012 (UTC)
    In most languages and translingual entries, I would also stick to the basic caps forms, as in DZ. Of course, follow a language's rules of orthography, or precedent of other dictionaries where the digraph is used (Dutch IJ?). In this case, Dz has zero definitions, and a meaningless description in the translingual entry, so I can't say that there's a reason for this dictionary entry at all. Michael Z. 2012-02-03 22:13 z
  2. Support CodeCat 17:07, 1 February 2012 (UTC)
  3. Support Michael Z. 2012-02-01 17:58 z
  4. Support Ungoliant MMDCCLXIV 19:59, 1 February 2012 (UTC)
  5. Weak support for English. Not convinced this is a good idea in general; it's just that I don't like needless duplication of information across entries. Equinox 22:45, 2 February 2012 (UTC)
  6. Support. Majuscule letters are the "presentation form" meant to, say, be inscribed in stone. For example, titles of books are often entirely in capitals (I see numerous examples in my own library). Capital letters are geometrically simpler, consisting entirely of compositions of straight lines and circular (or elliptic) arcs, and perhaps for that reason capital letters are the letters one first learns (as a child). A capital city represents a country (at least politically) even though small villages in it might be much more numerous; and, by analogy, the capital form of the letter should be the lemma for the lexeme. —AugPi (t) 19:39, 3 February 2012 (UTC)
    For some exceptional cases, like the German Eszett (ß), it might not, perhaps, make so much sense to have the capitalized form as lemma, but, for German, the majuscule form of the Eszett already seems to be being used as the lemma form (with a See also section; the minuscule form doesn't have one; and that See also section links to majuscule forms). As for the Greek sigma with its two minuscule variants, the fact that it has only one majuscule form (and that the same is true for phi), makes majuscules likelier candidates for the lemma forms of Greek letters. —AugPi (t) 20:06, 3 February 2012 (UTC)
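The case relationships raised in the comments above (one majuscule sigma versus two minuscules, a digraph with a separate titlecase form, and the Eszett) can be inspected with a short sketch using Python's built-in Unicode support. This is only an illustration under assumptions: the helper name `case_forms` is hypothetical, and the digraph is assumed to mean the single-code-point characters U+01F1 to U+01F3 rather than two-letter sequences.

```python
import unicodedata


def case_forms(ch: str) -> dict:
    """Return the Unicode name and default case mappings of one character.

    Hypothetical helper for illustration only.
    """
    return {
        "name": unicodedata.name(ch),
        "upper": ch.upper(),
        "lower": ch.lower(),
        "title": ch.title(),
    }


def demo() -> None:
    # Greek sigma: one majuscule, two minuscules with separate code points.
    for ch in "Σσς":
        print(ch, case_forms(ch))

    # In isolation the majuscule lowercases to σ; Python applies the
    # final form ς only when the sigma ends a word.
    print("Σ".lower(), "ΟΔΟΣ".lower())

    # Eszett: the default uppercase mapping is the two letters "SS",
    # while the capital form U+1E9E lowercases to ß.
    print("ß".upper(), "\u1e9e".lower())

    # Digraph dz (U+01F3) has distinct uppercase and titlecase forms.
    print(case_forms("\u01f3"))


if __name__ == "__main__":
    demo()
```

The mappings shown are the default ones shipped with the interpreter's Unicode database; they say nothing about which form a dictionary ought to lemmatise, only how the forms relate to one another.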
I support lemmatising the minuscule forms of letters
  1. Support. One argument I forgot to mention is that some majuscules are really badly supported (hello Ɥ, and hello to you too, Ɦ, as opposed to the minuscules ɥ and ɦ). -- Liliana 16:45, 1 February 2012 (UTC)
    Aren't those IPA signs? Why would they have different cases at all? —CodeCat 17:09, 1 February 2012 (UTC)
    They are orthographic letters in certain minority languages. -- Liliana 17:13, 1 February 2012 (UTC)
    Not a good reason. Lack of font support for new characters will always be a transitory problem, and it is purely speculation that in the long run it would affect majuscules more than minuscules. Michael Z. 2012-02-01 17:57 z
    By the way, the first is in Unicode 6.0 and displays correctly on my Mac, the second is from Unicode 6.1, released yesterday, and displays as a box. Michael Z. 2012-02-01 23:36 z
    I can see both, but that's to be expected I guess. Most people won't see either. -- Liliana 15:34, 2 February 2012 (UTC)
    Aren't those out of the scope (Roman, Cyrillic and Greek)? Ungoliant MMDCCLXIV 19:59, 1 February 2012 (UTC)
    These two are used in the Latin alphabets for some African languages, part of Unicode Latin Extended-D. Michael Z. 2012-02-02 00:48 z
    I had incorrectly assumed they were in an IPA block. But even if we lemmatise the majuscule we will need exceptions. The letters above were created because of the need of having uppercase in IPA-based alphabets; ß should also be lemma, not ẞ. Ungoliant MMDCCLXIV 02:08, 2 February 2012 (UTC)
    Certainly, it is ß, and not ẞ, that ought to be lemmatised, but I think that ought to be a reasonable exception to the general "lemmatise majuscules" rule. After all, ß is one of a very few minuscules that (traditionally) have no majuscule forms; in fact, are there any besides the Eszett and kra (ĸ)? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:26, 2 February 2012 (UTC)
    ƛ has no majuscule I know of. Other than that, nothing immediately comes to my mind. -- Liliana 15:31, 2 February 2012 (UTC)
    Well, there's this majuscule form: (codepoint: U+A798), but its addition is only proposed hitherto. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:15, 2 February 2012 (UTC)
    The Cyrillic modifier letters soft sign ь and hard sign ъ only have uppercase forms for stylistic reasons. Of course other exceptions will come up, but this vote is to determine the default choice, all else being equal. Michael Z. 2012-02-02 15:35 z
    Actually no, Bulgarian has words that start with ъ, and if those occur at the beginning of a sentence, capital Ъ is used (e.g. ъгъл ("angle")), and it isn't just a theoretical exercise either since Slavic languages don't use grammatical articles. -- Liliana 15:40, 2 February 2012 (UTC)
    Oops. I don't know if it's necessary here, but this brings up the question of lemmatizing different forms for different languages. Would it be acceptable to have the main entry in ъ for Russian and in Ъ for Bulgarian? Michael Z. 2012-02-02 18:24 z
    That would be very user unfriendly in my opinion. How would a reader know where to look? -- Liliana 18:47, 2 February 2012 (UTC)
    Each respective entry would say "Lowercase" or "Uppercase form of..." and link to its lemma entry. I'm not saying it's necessarily the best solution here, but I think it could be an acceptable option, especially when these represent a somewhat different letter in each language. Michael Z. 2012-02-02 19:36 z
    Except that they be glossed "Minuscule…" and "Majuscule form of [letter]", I agree with you. Even if we can't achieve consistency across languages, we should at least be able to achieve consistency within languages. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 20:49, 2 February 2012 (UTC)
  2. Symbol support vote.svg Support, though honestly I think we should treat both forms as lemmata. They generally have different meanings (e.g., the Σ of summation vs. the σ of standard deviation), and they're separate Unicode characters, and they're such a closed class. —RuakhTALK 18:35, 1 February 2012 (UTC)
    But this poll is about what to do with, for example Σ and σ, as letters. We ought certainly to have separate entries for different usages of such characters as symbols. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:27, 1 February 2012 (UTC)
    Well, but they're always letters. It's not that the character Σ is sometimes used as a letter and sometimes used as a symbol, but that the letter Σ is sometimes used as a symbol. —RuakhTALK 20:19, 1 February 2012 (UTC)
    But why would you want to duplicate pronunciatory, etymological, and usage information in both the majuscule and minuscule entries? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 23:43, 1 February 2012 (UTC)
    I wouldn't — but that's not the question. Workmanlike and kindness and patronizing are all lemmata, but that doesn't mean that all information has to be duplicated from [[workman]], [[kind]], and [[patronize]]. Conversely, [[bid]] has several lemmata that share a pronunciation — so that pronunciation is given only once. —RuakhTALK 02:37, 2 February 2012 (UTC)
    At present, we have the stupid situation where there's a lot of duplicated information at patronize and patronise because neither is lemmatised. In the case of letters, we can give a lot of information — especially, in the case of English ones, pronunciatory information — and for the same reasons that lemmatisation is A Good Idea™ generally, it's a good idea to lemmatise letters. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:26, 2 February 2012 (UTC)
    Yes, uncoordinated entries are a real problem affecting our quality as a dictionary. I can't even find our guideline governing lemmatizing, but I seem to remember something that actually required redundant lemma entries for American and British spellings of a term. Bad. Michael Z. 2012-02-02 19:43 z
    As far as I know, the only possibly workable guideline is to lemmatise whatever spelling's entry that was created first. In the case of patronize vs. patronise, that would lemmatise patronize (since it was created in December 2004, whereas patronise didn't exist until March 2007), which shouldn't be controversial. But then there are entry pairs like color vs. colour, where it seems impossible to reach consensus as to which ought to be lemmatised (by the same principle that lemmatises patronize, color (created in December 2002) would be lemmatised, with colour (created in May 2003) becoming a "soft redirect"; color didn't get a proper entry until the 4ᵗʰ of May in 2003, and colour didn't until the 15ᵗʰ of May in 2003, but regardless, the result is the same). — Raifʻhār Doremítzwr ~ (U · T · C) ~ 20:49, 2 February 2012 (UTC)
    How about lemmatizing the earliest attested form, or the most etymologically correct one? I believe this would favour some British and some American spellings. Yes, I'm sure there would be a lot of debate over the specifics. The duplication in English entries also concern capitalizations, including aboriginal/Aboriginal and labor/labour/Labour. (Sorry I'm getting off topic.) Michael Z. 2012-02-02 22:34 z
    Lemmatising the earliest attested form wouldn't work, because then we'd get a lot of obsolete late–fifteenth-century spellings being lemmatised. I'd support lemmatisation by etymological correctitude, but there has been a fair amount of opposition to such proposals in the past. How would you suggest that we resolve the duplication issuing from capitalisation? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 22:53, 2 February 2012 (UTC)
    Well, earliest-attested of the current forms. Capitalization is probably a case-by-case question. I once lemmatized Aboriginal because some style manuals recommend capitalizing it as an ethnonym, but its older twin grew back. I would now be happy to put it at the traditional basic form aboriginal to reduce duplication. In the end, the URL and page title are just convenience labels, and the full story is in the full text of a single entry (and lacks integrity as long as it remains scattered about several). Michael Z. 2012-02-02 23:19 z
    I agree; consolidation somewhere suboptimal is better than no consolidation at all. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 12:51, 3 February 2012 (UTC)
    So let's propose some good lemmatization guidelines. I can't even find the basic common-sense rules we all agree on in WT:English_definitions#Lemma_forms, WT:Lemmas, and WT:About_English#Regional_differences. Am I missing anything? Michael Z. 2012-02-03 14:53 z
    Let's work out lemmatisation rules specifically for letters before we work on ones for terms generally. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:14, 3 February 2012 (UTC)
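    The "lemmatise whichever spelling's entry was created first" guideline mentioned in this thread can be sketched in a few lines of Python. This is only an illustration, assuming creation dates are known as (year, month) pairs; the function name lemma_by_creation is hypothetical, and the dates are the ones quoted in the discussion above.

```python
def lemma_by_creation(created):
    """Return the spelling whose entry was created first.

    `created` maps each spelling to a (year, month) tuple; tuples compare
    year first, then month, so min() picks the earliest entry.
    """
    return min(created, key=created.get)


# Creation dates as quoted in the thread (month precision only).
patron = {"patronize": (2004, 12), "patronise": (2007, 3)}
colo = {"color": (2002, 12), "colour": (2003, 5)}

print(lemma_by_creation(patron))
print(lemma_by_creation(colo))
```

    The rule is mechanical, which is its appeal; as the thread notes, it says nothing about whether the result is acceptable to editors in contested pairs such as color and colour.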
I don't care which form we lemmatise, as long as we lemmatise consistently
I couldn't care less (i.e., I abstain)
  1. Abstain Mglovesfun (talk) 18:00, 1 February 2012 (UTC)
  2. Abstain --EncycloPetey 02:19, 2 February 2012 (UTC) There are problems with selecting only one or the other as lemma form, and so I don't think we can make a choice for one over the other. Some letters in some languages, such as German and Slovak, have only a minuscule form (the majuscule is theoretical but is never used in the language), and in some languages the majuscule has more than one associated minuscule form. I don't think either form should be lemmatized over the other. --EncycloPetey 02:19, 2 February 2012 (UTC)
    Are there some principles you can recommend whereby we might reach ad hoc solutions? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:26, 2 February 2012 (UTC)
    Umm... solutions to what? I don't see a problem as everyone else seems to. This went to a poll before the "problem" was clarified. There are quite a few issues being discussed here. --EncycloPetey 03:07, 3 February 2012 (UTC)
    What I meant was, are there some principles you can recommend whereby we might decide which form (be it the minuscule or the majuscule) to lemmatise in any particular case? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 12:51, 3 February 2012 (UTC)
    I think you've misunderstood what EncycloPetey wrote. He didn't write, "I think that one form should be lemmatized in some cases, and the other form in other cases." He wrote, "I don't think either form should be lemmatized over the other." That is, that neither form should be treated as a mere "form-of" of the other form. (Unless, of course, it's I who misunderstood.) —RuakhTALK 22:34, 3 February 2012 (UTC)
    Perhaps you're right in your interpretation, but that just means that EP advocates an unworkable "solution". — Raifʻhār Doremítzwr ~ (U · T · C) ~ 22:47, 3 February 2012 (UTC)
    I interpret any result of this vote as a default choice, a recommendation for consistency when there aren't any specific circumstances that dictate the choice. Obviously, we would lemmatize lowercase and not uppercase ß. But shouldn't the English letter ess have one definition and not three, at S, s, and ſ? We're a dictionary, not a catalogue of Unicode code points. If we can neatly define the diverse verb wrought as a form of both work and wreak, why on earth should we have redundant entries defining the letter J? Michael Z. 2012-02-03 23:56 z
    I agree with your way of interpreting whatever is the result of this straw poll. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:30, 4 February 2012 (UTC)
    ſ would deserve special treatment I guess, or how would you describe its use at capital S? -- Liliana 00:12, 4 February 2012 (UTC)
    Most of the information on that letter will be at S, but the information specific to in which circumstances ſ ought to have been used instead of s should be at ſ. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:30, 4 February 2012 (UTC)

Formatting

This is totally retarded! We're voting on something even though none of us even know what the result would look like!

Am I right that if this passed, it would look like this?

Primary form

 ==Translingual==
 ===Letter===
 {{infl}}
 #the first letter of the Latin alphabet, yadda yadda
 ===Abbreviation===
 #additional case-sensitive meanings
 ----
 ==English==
 ===Letter===
 {{infl}}
 #the first letter of the English alphabet
 ----
 ==Spanish==
 ===Letter===
 {{infl}}
 #the first letter of the Spanish alphabet

Secondary form

 ==Translingual==
 ===Letter===
 {{infl}}
 #{{secondary form of|[[link to primary form here]]}}
 ===Abbreviation===
 #additional case-sensitive meanings

Or do you want sections for every single language in the secondary form, whichever it will be? -- Liliana 15:17, 3 February 2012 (UTC)

English a is a form of English A, so there would be a language section. Michael Z. 2012-02-03 15:56 z
If we do decide to have language sections for every language at the secondary form, then there's nothing gained from choosing one form, as you need to synchronize the two entries anyway (add/remove form-of entries etc), in which case this very discussion is pointless. -- Liliana 16:16, 3 February 2012 (UTC)
This is a separate and larger question. We never use "translingual" as a substitute for individual language entries.
Besides that, English a is likely to remain the minuscule form of A during our lifetimes, so I don't understand what synchronizing problems exist. On the other hand if we don't lemmatize letters, then a letter entry would become out-of-sync or even contradictory with every single edit. This gives the advantages of the w:DRY principle. Michael Z. 2012-02-03 16:51 z
Yes, but a form-of entry of a letter can still contain a pronunciation, audio file, possibly homophones, external links, and Daniel-style lists. Those still have to be synchronized if we were to keep the language entries, so there would be almost nothing saved in maintenance required. -- Liliana 16:56, 3 February 2012 (UTC)
Anything that needs to be synchronized should be moved to the lemma entry. Anything else doesn't need to be synchronized. That's the whole point. It serves the task of the editors, the integrity of our information, and the goals of our readers. Michael Z. 2012-02-03 17:09 z
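
The primary/secondary layout debated above can be sketched as a small Python generator, to make concrete what would and would not be duplicated. This is an illustration only: the function names are hypothetical, and {{secondary form of}} is the placeholder used in the mock-up above, not an existing template.

```python
def letter_section(language, definition, abbreviations=()):
    """Build one language section of a letter entry as wikitext."""
    lines = ["==" + language + "==", "===Letter===", "{{infl}}", "#" + definition]
    if abbreviations:
        lines.append("===Abbreviation===")
        lines.extend("#" + a for a in abbreviations)
    return "\n".join(lines)


def primary_entry(sections):
    """Join language sections with the ---- separator used in the mock-up."""
    return "\n----\n".join(letter_section(*s) for s in sections)


def secondary_entry(primary_title, languages, abbreviations_by_language=None):
    """Secondary form: every language section only points at the primary form.

    Case-sensitive meanings (abbreviations) stay on the secondary page itself.
    """
    abbreviations_by_language = abbreviations_by_language or {}
    pointer = "{{secondary form of|[[" + primary_title + "]]}}"  # placeholder template
    return primary_entry(
        [(lang, pointer, abbreviations_by_language.get(lang, ())) for lang in languages]
    )


print(secondary_entry(
    "A",
    ["Translingual", "English"],
    {"Translingual": ["additional case-sensitive meanings"]},
))
```

Under this sketch the only thing a secondary page shares with its primary page is the pointer line, which is the point Michael Z. makes: whatever would need synchronizing lives in the lemma entry alone.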

If anyone doubts that a letter entry can contain extensive information, I invite you to read the NED's and OEDs' entries, links to which I have provided here; hopefully, they will show the need to lemmatise letters. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:10, 3 February 2012 (UTC)

To clarify, what I'm saying is that all that information should be in the entry for one letter form only, and not duplicated over both (or, in some cases, all) the letter forms. I hope that that is not controversial. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 13:37, 4 February 2012 (UTC)
All what information? Exactly which information do you and others think can be consolidated and which cannot? We can't consolidate the pronunciation information for A and a at the majuscule entry, because the minuscule is both a letter and a word. Likewise, we can't consolidate the etymology at the majuscule entry, because the word a has a separate etymology requiring multiple etymology sections. We can't consolidate quotations (if we have them) because we want quotations to support each form of a term/item. So what information do you think can be consolidated? --EncycloPetey 04:42, 8 February 2012 (UTC)
The article a and the abbreviations A and a are entirely different from the letter A, a; you're just confusing them because of homography. The entry for "A, n." in the OED [3ʳᵈ ed., June 2011] has this in its "Etymology" section (I've just copied and pasted it, so its formatting, links, &c. have not been reproduced, but it'll give you an idea):

OED [3ʳᵈ ed., June 2011] etymology section for "A, n."

Letter form. The first letter of the English alphabet, as it was of the ancient Roman Alphabet (and as were its prototypes Alpha of the Greek alphabet, and Aleph of the Phoenician and ancient Semitic alphabets). The English letter form capital A reflects Latin A , itself reflecting Greek Α (capital alpha).

Letter name. a is usual as the name of the letter in classical Latin, and hence in English. (In ancient Greek the name of the letter was ἄλϕα alpha n.) The plural has been written aes , A's , As .

Sound. In both Greek and Latin this symbol represented the vowel formed with the tongue in the lowest position in the mouth, distinguished by vowel height from the next closest (front and back) vowel sounds represented by e and o . Long and short /a/ phonemes existed in each language. In Old English there was additionally a phonemic contrast between low front and back vowels; the back sound /ɑ/ (and its long equivalent) was represented by a , and the front sound /æ/ (and its long equivalent) by the digraph æ (called ash : see ash n.4). This phonemic contrast is not found in the sound system of later stages in the history of English, a subsequently being typically the representation of the lowest sound in the English sound system (as in Greek and Latin) rather than of a distinctively low back sound (as in Old English).

In modern English the symbol a typically represents:

(i) British English /a/ , U.S. English /æ/ , in e.g. man , rat (by some phoneticians the British English vowel is transcribed as /æ/ rather than /a/ );

(ii) British and U.S. English /eɪ/ , in e.g. name , rate ;

(iii) British English /ɑː/ , U.S. English /ɑ/ , in e.g. father , palm ; in British English typically also in e.g. calf , half , and (in some, typically southern or RP, varieties) also in e.g. bath , fast , dance , sample (standard lexical sets palm and bath : see J. C. Wells Accents of English (1982 ) I. p. xviii–xix, 142–4, 133–5);

(iv) British and U.S. English /ɪ/ , in e.g. village or (sometimes) climate ;

(v) British and U.S. English /ə/ , in e.g. comma , amoeba , or (sometimes) climate ;

(vi) (after w ) British English /ɒ/ , U.S. English /ɑ/ , in e.g. wan , watch , want ;

(vii) (after w ) British English /ɔː/ , U.S. English /ɔ/ , /ɑ/ , in e.g. war , warm , water .

On the spelling for British and U.S. English /ɛ/ in many and any see discussion at many adj., pron., n., and adv.

A is the first letter of several digraphs, as follows:

ai , ay , representing:

(i) British and U.S. English /eɪ/ , in e.g. pain , pay ;

(ii) (before r ) British English /ɛː/ , U.S. English /ɛ(ə)/ , in e.g. pair ;

(iii) (rarely) British English /ʌɪ/ , U.S. English /aɪ/ , in e.g. aye 'yes' or (in British English) Isaiah ; (in Scots and sometimes in English regional (northern) use this digraph also occurs for /e/ in e.g. ain 'own', ait 'oat', etc.);

au , aw , representing:

(i) British English /ɔː/ , U.S. English /ɔ/ or /ɑ/ , in e.g. laud , law , taut , taught , caught ;

(ii) British English /ɑː/ or /a/ , U.S. English /æ/ , in e.g. laugh , draught , aunt ;

(iii) British English /ɒ/ , U.S. English /ɔ/ , in e.g. laurel , Laurence (compare also Lawrence );

(iv) British English and U.S. English /eɪ/ , in e.g. gauge ;

(v) British English /əʊ/ , U.S. English /ɔ/ , /oʊ/ , or (in some cases) /ɑ/ , in e.g. mauve , sauté ;

(vi) (rarely, in unnaturalized borrowings) British and U.S. English /aʊ/ , in e.g. Rathaus , luau ;

ae , representing (chiefly in words ultimately of Latin and Greek origin, the pronunciation of many of which varies considerably):

(i) British English /iː/ , U.S. English /i/ , in e.g. aeon , aetiology , or (sometimes) abscissae ;

(ii) British English /ɪ/ , /ə/ , U.S. English /ə/ , in e.g. Aeneas , Aegean ;

(iii) in U.S. English sometimes also /eɪ/ in e.g. aegis , Aegean ;

(iv) in U.S. English /aɪ/ and in British English sometimes /ʌɪ/ , in e.g. abscissae (as also regularly in words of Italian origin, as maestro );

(v) British English /ɛː/ , U.S. English /ɛ/ in aerial and related words;

(vi) British and U.S. English /ɛ/ in e.g. haemorrhage .

ao , representing British and U.S. English /aʊ/ in e.g. tao , Maoism , maori ; also, in gaol , British and U.S. English /eɪ/ ;

aa , representing (with varying distributions in different words) British English /ɑː/ , /a/ , /eɪ/ , or /ɛː/ , U.S. English /æ/ , /ɑ/ , /eɪ/ , or /ɛ/ , in e.g. ma'am , maas , Baal , Aaron .

Main developments within English. The following gives a very brief outline of the origins and development of the main sounds represented by a in English.

(i) The short vowel.

In Germanic, short a corresponds both to a in other branches of Indo-European (compare e.g. Latin ager with the early forms and Germanic cognates at acre n.) and (as a result of an early merger in Germanic) to o (compare e.g. Latin hostis with the early forms and Germanic cognates at guest n.).

As a result of a very early sound change in English short a (of whatever origin) in accented syllables was fronted to æ except when followed by a nasal consonant (compare dæg 'day' with mann 'man', and on the latter see further discussion at O n.1), although a was later restored in open syllables before a following back vowel (compare within the paradigm of dæg 'day' the nominative and accusative singular forms dæg alongside the nominative and accusative plural forms dagas ). In Old English a and æ were distinct phonemes, and were affected by neighbouring sounds in different ways which have profound effects on the subsequent histories of many words; in some dialects of Old English (Kentish and some Mercian varieties) æ was also fronted further to e . In Middle English the phonemic distinction between a and æ was lost, surviving instances of æ (which had been neither affected by further sound changes nor fronted to e ) generally being merged with a .

Sound changes have affected the reflex of Middle English a in a number of words:

(i) after w , a was rounded in e.g. wan , watch , want ; this probably happened in the early 17th cent. or earlier, although even in standard English there was a good deal of variation in the 17th and 18th centuries, and later in some cases (compare swam , past tense of swim v., and also pronunciation history at quaff v., waft v.1); in some cases the resultant sound occurred in a lengthening environment, as in war , warm (and anomalously in water );

(ii) father , palm , calf , half , bath , fast , dance , sample show various different processes of lengthening of historically short a in modern English; these changes have not all occurred in all varieties, and today provide significant distinctions between different national varieties of English (and in the case of the classes illustrated by bath , fast , dance , sample , considerable variation within standard British English).

A further source of /a/ in modern English is lowering of e before r in late Middle English (as also in Anglo-Norman and Old French, Middle French), as in carve or tar , although the application of this sound change is very variable (compare person n. and parson n.), and in some cases the e spelling is retained even when the change in pronunciation to /a/ is found (compare clerk n.).

(ii) The long vowel.

The history of ā and ǣ is much less straightforward than that of the short sounds.

Owing to an early merger of ā and ō in Germanic, ā in most other Indo-European languages corresponds to ō in Germanic (see O n.1).

The main source of Old English ā is Germanic ai : compare e.g. Old English stān 'stone' with the Germanic cognates listed at stone n.; in the Middle English period this sound became rounded in southern and midland dialects, giving Middle English open ō (see O n.1); in the north and Scots this rounding did not occur, and Old English ā is generally continued in Scots as /e/ (often spelt ai , as in ain , ait , stain , more commonly stane ), and in northern English varieties frequently as a falling diphthong with high first element, /iə/ .

Old English ǣ as a spelling form represents sounds of two different origins with different distributions in the various Old English dialects. Firstly, it shows the reflex of Germanic ē , corresponding to ā in most other West Germanic (and North Germanic) languages; see e.g. the cognates listed at deed n. In the dialects of Old English other than West Saxon this sound (often called ǣ 1) was ē rather than ǣ , hence West Saxon dǣd beside Anglian and Kentish dēd . Secondly, ǣ resulted from the i -mutation of ā (although in Kentish this was further fronted to ē ), as in heal v.1, which is hǣlan in both West Saxon and Anglian; this sound is often called ǣ 2. The reflexes of both ǣ 1 and ǣ 2 in modern English are mid or high vowels, generally spelt e , ea , or ee , and their later history is therefore dealt with at E n.1

Middle English ā therefore does not continue Old English ā (or ǣ ). Its main origins are instead:

(i) early Middle English lengthening of a in open syllables in disyllabic words;

(ii) borrowing of words containing Anglo-Norman and Old French, Middle French ā ; also Anglo-Norman and Old French, Middle French au in certain phonological contexts, as in save , chamber .

As a result of the Great Vowel Shift this sound became raised to a mid height vowel and subsequently diphthongized; however, a spellings were preserved as a result of the degree of standardization and conservatism by this stage found in the spelling system. (The new long low height vowel created by various lengthening processes after the Great Vowel Shift has already been described above.)

Brief notes on digraph spellings in modern English.

ai , ay generally reflects: (i) Old English or early Middle English diphthongs formed from a low vowel before palatal g , as in day n.; (ii) Anglo-Norman and Old French, Middle French ai , as in pay v.1 or bailiff n. Since ai and ei merged in Middle English (see E n.1), words historically showing ei sometimes show ai or ay spellings in modern English, as e.g. sail n.1, way n.1

au , aw generally reflects: (i) a low vowel before w in Old English, as in claw n. or raw adj.; (ii) Old English or early Middle English diphthongs formed from a low vowel before velar g or the fricative /x/ , as in law n.1 or slaughter n.; (iii) Anglo-Norman and Old French, Middle French au , as in jaundice n., laundry n., aunt n.

Modern spellings with ae or æ show no continuity with æ spellings in Old English or early Middle English, but instead mostly show learned borrowings of Latin words showing (in classical Latin) the diphthong ae . In some instances ae in such words ultimately reflects substitution in Latin of ae for Greek αι in words borrowed into Latin from Greek; hence aether and æther as spellings of ether n. Many words which historically showed either ae or e spellings now always or predominantly show e (as e.g. ether n., phenomenon n.); ae is retained in many proper names relating to the ancient world, such as Caesar or Aeneas , and in some technical terms, such as (in British English) aetiology n., as well as a few slightly commoner words, such as aegis n.

If you don't think all that's worth consolidating in one place, then there's clearly no argument I can make to persuade you. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 04:28, 16 February 2012 (UTC)

Interwikis

What is going on with the iw's? Jcwf 00:15, 1 February 2012 (UTC)

Very weird, isn't it? The translations are not linked to other wikis either (if that's not the same issue). --Anatoli (обсудить) 00:31, 1 February 2012 (UTC)
[[half]] seems to be fine (both interwiki-wise and translation-link-wise); and I just tried making a null edit, to see if it might be a parser issue, and the null edit did not break it. So I don't know why some pages would be affected and some not. —RuakhTALK 01:17, 1 February 2012 (UTC)
Oh, but now WT:BP is fine. So maybe it was a parser issue, but has been fixed? If you see any other pages with this problem, maybe try making a null edit? (That is, going to the "Edit" tab and clicking "Save page" without making any changes. That won't show up in the edit-history, but it will cause the page to be re-parsed.) —RuakhTALK 01:20, 1 February 2012 (UTC)

Definition of article

I heard namespace 0 pages no longer require an internal link to be counted as article. Is this true for all wiktionaries? If so I'll update wikistats accordingly. Thanks, Erik Zachte 05:16, 1 February 2012 (UTC)

I believe it's true for all Wikimedia wikis; can anyone confirm this? Mglovesfun (talk) 13:53, 4 February 2012 (UTC)

Words requiring another word

I am thinking about words requiring another word, such as unrained (on, upon) or undwelt (in). Is there any term for these? Should we treat them in any special way, e.g. categories? Equinox 01:11, 3 February 2012 (UTC)

Hmm:
  • 1996, Herbert M. Collins; Franklin R. Hall, Michael Hopkinson, Pesticide formulations and application systems, volume 15, page 187:
    A relative value (the visual rating) was thus obtained for the test formulation. The "unrained" surfaces were not rated visually in this study as the final aim of the method evaluation was to compare the values of the "rained" surfaces of the test formulations to the "rained" surfacs of the reference formulations.
  • 2002, Demografie, volume 43-44: 
    Development of the number of dwellings registered also a considerably faster rate of growth as regarded undwelt flats ... For the first time since 1970 even the absolute growth of the number of undwelt flats was higher than of those....
But both are formed in parallel to phrasal verbs. DCDuring TALK 16:40, 3 February 2012 (UTC)
All kinds of words require other words, for example, rational number requires integer, circle requires point, etc. —AugPi (t) 19:09, 3 February 2012 (UTC)
I think Equinox means that class of words that must be construed with a preposition. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:18, 3 February 2012 (UTC)

Deleting empty categories; yea or nay?

I'd like to think it's ok (but not mandatory) to delete any empty category which isn't meant to be empty most of the time. To qualify 'meant to be empty most of the time', I mean like Category:French nouns lacking gender or Category:Spanish plurals, where ideally they would never be used, but are there to catch entries with problems.

Argument for deleting empty categories: it's rather irritating to follow a link, or type a name into the search box, and land on a category that says something like "This category is for the [foo] names of various languages." and then has zero entries. I'd prefer a red link to a blue-linked category with nothing in it. NB when the category is valid but empty, such as Category:Old Provençal terms derived from Persian (example), it can be restored immediately when used. I say this specifically in relation to Special:UnusedCategories where there are around 2000 at the moment. I think it's okay to delete the majority of these. Mglovesfun (talk) 11:03, 4 February 2012 (UTC)

I think that it is OK to delete these. People can always add such categories to their watchlist (I watch the deleted Category:Tbot entries (Italian) for example) in order to catch problems. SemperBlotto 11:13, 4 February 2012 (UTC)
It's OK with me if you delete those categories, but could you add their preambles to their respective talk pages before you do so, please? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:11, 4 February 2012 (UTC)
Yes but I wouldn't, it would be a waste of time. If the category is used again it can be restored. Too many more important things to be done round here. Mglovesfun (talk) 14:12, 4 February 2012 (UTC)
Obviously, there's no point in copying preambles that are merely generated by templates like {{prefixcat}}, but I think you should copy preambles that are manually inputted. Wouldn't you agree? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 14:35, 4 February 2012 (UTC)
Until we have a way to automatically generate categories as soon as they are needed, I'd really rather these kinds of categories be kept. When these are needed again, they are likely to stay redlinked for quite a while. --Yair rand 09:48, 5 February 2012 (UTC)
Yes, I agree. —RuakhTALK 23:08, 5 February 2012 (UTC)
Keep them IMO. Categories meant to be empty and there to catch mistakes are useful for (a) preambles and (b) __HIDDENCAT__. I don't understand the argument "it's rather irritating to click via a link or by typing into the search box and find an empty category": where does one see such a link (except of course on the bottom of a page, where it exists even if it's red)? How often does one search for categories by name in the search box?​—msh210 (talk) 18:30, 5 February 2012 (UTC)
I think it's more generally a case by case basis, but I'd have to lean towards Msh210 (talkcontribs) and say to keep them. -- Cirt (talk) 23:58, 13 February 2012 (UTC)

Per request I am posting this here.

  • Double redirects are redirect pages that link to redirects
  • There are five types of double redirects, only one of them (first type) is typically fixable by bot
    1. ordinary double redirects: Redirects that link to other redirects that eventually lead to an article
    2. protected double redirects: Redirects that are protected from edits that link to other redirects
    3. external double redirects: Redirects that link to other redirects that eventually lead to a wiki page on another wiki
    4. self redirects: Redirects that link to themselves
    5. redirect loops: Redirects that link to other redirects that do not lead to an actual article
  • Double redirects are a navigational hazard for the reader as they will not re-redirect the user.
  • Pywikipediabot has redirect.py which can be used to handle ordinary double redirects (type 1 in the list above) when used with "double -always" parameters. (intended code here)
  • En.wiktionary gets a few double redirects once in a blue moon, so a bot flag may be unnecessary; en.wiktionary currently has no double redirects. That said, if many double redirects are created at once, such as with username renames or mass moves of articles, there would be a flood of recent changes, so a bot flag could be a good idea.
  • Human edits are unnecessary, as the task is mundane and routine; it would be a waste of human time to keep watching the special page and making edits that can be delegated to bots.
  • The bot currently operates on practically every Wikimedia wiki.
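
To make the five types listed above concrete, here is a minimal Python sketch of how a redirect could be sorted into them. It is not how redirect.py works internally; the function names, the `redirects` mapping (redirect title to target title), the `protected` set and the `INTERWIKI_PREFIXES` set are all assumptions made up for illustration.

```python
# Hypothetical illustration only; this is not pywikipediabot code.
# Assumed interwiki prefixes that mark a target as living on another wiki.
INTERWIKI_PREFIXES = {"w", "fr", "de"}


def classify_redirect(title, redirects, protected=frozenset()):
    """Return the double-redirect type of `title`, or None for a plain redirect.

    `redirects` maps each redirect page to the page it points at.
    """
    target = redirects[title]
    if target == title:
        return "self"            # type 4: links to itself
    if target not in redirects:
        return None              # single redirect, nothing to fix
    if title in protected:
        return "protected"       # type 2: cannot be edited by an ordinary bot
    seen = {title}
    current = target
    while current in redirects:  # follow the chain to its end
        if current in seen:
            return "loop"        # type 5: never reaches an actual page
        seen.add(current)
        current = redirects[current]
    prefix = current.split(":", 1)[0]
    if ":" in current and prefix in INTERWIKI_PREFIXES:
        return "external"        # type 3: ends on another wiki
    return "ordinary"            # type 1: the bot-fixable case


def final_target(title, redirects):
    """Follow an ordinary chain and return the page it finally leads to."""
    current = redirects[title]
    while current in redirects:
        current = redirects[current]
    return current
```

Under these assumptions, only a redirect classified as "ordinary" would be retargeted, by pointing it straight at `final_target(title, redirects)`; the other four types would be left for a human to look at.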

I hope this gives a good general idea about the problem and how bot edits can help. -- Cat chi? 02:50, 5 February 2012 (UTC)

Does KassadBot or another already do these? (I assume not, but figured I should check.)​—msh210 (talk) 18:33, 5 February 2012 (UTC)
It doesn't touch redirects -- Liliana 18:38, 5 February 2012 (UTC)
Anyone have an idea how often this is an issue? Roughly how many edits this bot will make per, say, year?​—msh210 (talk) 17:36, 7 February 2012 (UTC)
It depends entirely on user activity: on how often pages are moved and on how many redirects point to the moved pages. If no one moves pages, no edits would be made. Page moves do happen. Currently the bot would make edits once in a blue moon. It would probably be a few edits per year; however, do consider a scenario where:
  • Page A with hundreds of redirects linking to it
  • Page A is moved to page B
  • Bot would make hundreds of corrections flooding the RC feed.
I however noticed a pattern of the deletion of older redirected pages. Examples include renames of:
I do not know if such deletions are based on past consensus, but it is my belief that deletion of redirects is a bad method. It makes the entire site difficult to cite, as for instance someone citing legerrio would not be able to retrieve the information again. Furthermore, with such deletions, the signatures of renamed accounts in every discussion they previously participated in will become redlinks, removing proper attribution of comments. This is probably a separate discussion, so I do not want to indulge in it too much, but this is something to consider.
-- Cat chi? 09:16, 12 February 2012 (UTC)

Misuse of "uncountable"

I've just noticed that North Pole is marked as uncountable. This is incorrect. "Uncountable" refers to the non-existence of a plural form of a word, not the uniqueness of the thing the word refers to. It is perfectly possible to form the phrase "North Poles" even though the Earth has only one. (In any case, other planets have a North Pole, and so we could say "the North Poles of the planets in the Solar System".)

Would someone like to volunteer to replace "uncountable" with plurals in entries that are actually countable nouns? — Paul G 16:12, 5 February 2012 (UTC)

Unfortunately I used to get this wrong quite a lot, usually where I really meant {{en-noun|!}} i.e. no plural attested (without being a mass noun). Equinox 16:17, 5 February 2012 (UTC)
I don't think one actually could say "North Poles"; it would be north poles, uncapitalized. North Pole is a proper noun, so it should use the en-proper noun template, not the en-noun template. --Yair rand 16:17, 5 February 2012 (UTC)
  • 1822, The gentleman's magazine, and historical chronicle, volume 92, Part 2, page 212: 
    There is a satisfactory proof that the conjoint action of the two North Poles occasions the line of no variation.
  • 1947 October 20, "Three Magnetic Poles In Arctic", Milwaukee Sentinel:
    Army aviators have established a year 'round defense against Russian attack across the Arctic and have added the discovery of two magnetic North Poles to one previously known.
  • 2005, James Maxlow, Terra Non Firma Earth:
    Figure 39 Recent geomagnetic North Poles plotted as small circle arcs.
    Counterexamples. DCDuring TALK 17:20, 5 February 2012 (UTC)
    I stand corrected. Are those noun senses or proper noun senses, though? --Yair rand 08:56, 6 February 2012 (UTC)
    As a quick answer: I don't know. I would think that the magnetic pole and the rotational pole would each be a proper noun. But similarly, the Durings would seem to be a proper name as well, perhaps short for a list of full names or referring to a complete lineage without anyone being able to identify all the members of the group. DCDuring TALK 19:15, 6 February 2012 (UTC)
I've gone through and got several that I think might have been mismarked, but since I'm new to Wiktionary (well, I registered in 2004, but only to correct a prescriptivist who was being a prick about the singular they), I won't do more until someone checks my recent contribs to that effect. —Quintucket 18:25, 5 February 2012 (UTC)
In the case of Laserdisc I have now split it into Proper Noun and (countable) Noun sections. The two use different templates. Equinox 18:29, 5 February 2012 (UTC)
There is also {{singulare tantum}}. Mglovesfun (talk) 10:59, 6 February 2012 (UTC)

Latin -que compound words

Following up on the little discussion there was last year in RFV (when archived: Talk:fasque), I've opened what I hope can be a bigger discussion about our policy towards -que on the WT:ALA talk page (Wiktionary talk:About Latin#Latin -que compound words). - -sche (discuss) 20:07, 7 February 2012 (UTC)

Citations from online sources

CFI says: "As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups, which are durably archived by Google." Until recently, I have understood this to mean that online sources are acceptable as long as they meet a certain "durability" threshold, one presumably lacked by forums or everyday people's blogs and journals. Thus, I've been culling citations from content found on sites like CNN.com, The Huffington Post, Gamespot, etc. for a while now, and have seen the same done by others.

But I've been left with questions about the acceptability of drawing citations from online sources after this discussion on RFV. If it's really the case that online sources are generally considered unacceptable, it doesn't make sense to me why Usenet would be the exception to the rule, because I can't see any special quality that sets Usenet apart. I'm puzzled that it would be considered acceptable to draw citations from Usenet, but not from content on any other online source, no matter its stature, and am concerned about how this seemingly arbitrary limitation would adversely affect my ability to attest words and phrases.

Can I get some clarification? I'm honestly confused here. Astral 23:16, 7 February 2012 (UTC)

I think the main difference is that usenet isn't owned by a single entity but can be mirrored by anyone on the internet. That means that no single entity can take the sources offline either, which is what gives them their durability. —CodeCat 23:45, 7 February 2012 (UTC)
Does this mean that non-Usenet online sources like CNN.com should not be used for citations? Astral 01:41, 8 February 2012 (UTC)
I think it does mean exactly that. If a given entity has a policy that says that it will archive all articles as originally written, it might be worth considering, but such policies can change. As long as the material is copyrighted, even the legality of archival copying is at issue.
On the general question of archiving digital information, consider the following:
  • 2001, Bruce Sterling, Digital Decay[5]:
    Originally delivered as the keynote address for Preserving the Immaterial: A Conference on Variable Media at the Solomon R. Guggenheim Museum on March 30, 2001
    Bits have no archival medium. We haven't invented one yet. If you print something on acid-free paper with stable ink, and you put it in a dry dark closet, you can read it in two hundred years. We have no way to archive bits that we know will be readable in even fifty years. Tape demagnetizes. CDs delaminate. Networks go down.
DCDuring TALK 02:11, 8 February 2012 (UTC)
I'm not sure he's comparing like to like here. You can print bits on acid-free paper with stable ink. Printing a DVD on acid-free paper with stable ink would take a lot of space, but you can fit 17,000 books on there[6], and many an organization that has tried to store that quantity of paper has lost it to fire or water. Is the ongoing maintenance required to keep 17,000 books safe cheaper or easier than making an annual copy of a DVD? Or if you trust film stock (and they swear that it will last hundreds of years), even if you only stuff 640 x 480 b/w bits per frame, an hour and a half of film will hold as much as a DVD. We can't permanently archive bits in the quantities we're used to slinging around, but bits and the information stored in them haven't got harder to store.--Prosfilaes 03:05, 8 February 2012 (UTC)
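
The film-stock figure above can be checked with a few lines of arithmetic. The sketch below is only a rough check under stated assumptions: one bit per pixel at 640 x 480, a frame rate of 24 frames per second (not given in the comment, assumed here as the usual cinema rate), and a single-layer DVD taken as 4.7 GB.

```python
# Rough capacity check; the frame rate and DVD size are assumptions.
def film_capacity_bytes(width, height, frames_per_second, seconds):
    """Bytes stored on film holding one black/white bit per pixel per frame."""
    bits = width * height * frames_per_second * seconds
    return bits // 8


DVD_BYTES = 4_700_000_000  # single-layer DVD, nominal capacity (assumed)

ninety_minutes = 90 * 60
film_bytes = film_capacity_bytes(640, 480, 24, ninety_minutes)

print(f"film: {film_bytes / 1e9:.2f} GB, DVD: {DVD_BYTES / 1e9:.2f} GB")
```

On those assumptions the hour and a half of film comes out slightly above the nominal single-layer DVD capacity, which is consistent with the comparison made in the comment.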
(edit conflict) Isn't it counterproductive for an online dictionary that bills itself as such ("As Wiktionary is an online dictionary...") to avoid citing online sources on the principle that digital media doesn't last as long as paper? Digital media is cheaper and consumes a lot less space than paper media, meaning that, in the 21st century, there's more incentive to build and maintain digital archives. But it would seem more beneficial to have citation standards based on concrete criteria — like Wikipedia's RS — rather than abstract ideas about the relative permanency of various media formats. Astral 03:13, 8 February 2012 (UTC)
It's not an abstract idea; it's a concrete practical solution to the idea of being able to check a citation in a decade or two. Why is it counterproductive for an online dictionary to avoid citing online sources? I don't see the connection there.--Prosfilaes 03:21, 8 February 2012 (UTC)
If it was concrete, when I asked what the citation standards were, I would have been directed to a policy page with clearly defined and outlined criteria. The information I've found or been given has been contradictory. CFI says, "As Wiktionary is an online dictionary, this naturally favors media such as Usenet groups," but users are telling me that online media isn't appropriate for citation because it isn't as lasting as paper media (which is debatable). Except Usenet. Astral 04:01, 8 February 2012 (UTC)
There's a difference between ill-defined and abstract. It's not debatable that online media isn't as lasting as paper media; most books are owned by several libraries in permanent collections in formats that have a life expectancy of centuries, as well as being held by Google and UMich in online formats.--Prosfilaes 04:47, 8 February 2012 (UTC)
I don't see your "durability" threshold. I just don't see any evidence that CNN and friends tend to stick around longer than anyone else. It doesn't take a lot of money to stay online; I bet a small website could stay online in perpetuity for $10,000. But it does take a will to do so, and I see no evidence any of them have made any claim that that's a goal of theirs. Moreover, if we plan on sticking around for another decade or two, I'm not sure that we can trust even those claims.--Prosfilaes 02:47, 8 February 2012 (UTC)
I still don't get how Usenet is somehow the exception to the "digital media is not durable" rule. Copyright argument aside, Usenet archives are just as prone to the whims of fate as any other online source, i.e. just as likely to be rendered inaccessible through the shut down of a site or succumb to storage medium decay or destruction. It's not really feasible to base citation standards around personal suppositions about what media formats or sources are the most "durable," because there's no way to conclusively know how technology is going to progress. Astral 03:43, 8 February 2012 (UTC)
In reality, the inclusion of Usenet has more of a practical purpose; it allows us to include relatively recent slang words that would otherwise be unattestable. -- Liliana 04:11, 8 February 2012 (UTC)
But there's no one source of Usenet, and Usenet gives an implied license to archive to basically anyone. (I believe there's an X-Archive: No header or something that can be used to rebut that presumption, but most Usenet posts don't have that.) CNN can and does unilaterally take down posts. It's obviously feasible to base citation standards around suppositions of what media formats are most durable, because we've done it. A lack of conclusive knowledge has nothing to do with the feasibility, merely the wisdom. While we don't know conclusively anything, I think our choices have a good chance of being correct; libraries, particularly academic libraries, aren't going anywhere quickly, and Google and UMich are working on making paper sources also online ones.
You want to attack Usenet? Okay, but I don't think it will win you what you want. Usenet is an exception to our general rules because it's such a convenient corpus. I'm guessing that arguing that it's no more durable than online materials, if it provoked a change, would be more likely to exclude Usenet as a citation source than add arbitrary online sources.--Prosfilaes 04:47, 8 February 2012 (UTC)
  • Comment: Keep in mind please, just because a source goes offline, does not mean it is not durable. It can still be accessible in news archive sources like Newsbank, or Lexis Nexis, or Westlaw. Cheers, -- Cirt (talk) 05:14, 8 February 2012 (UTC)
    What are the inclusion policies of those organizations? DCDuring TALK 08:57, 8 February 2012 (UTC)
    They're durably archived, digitally, microfiche, the works. -- Cirt (talk) 18:52, 8 February 2012 (UTC)
    I meant: what content do they include from, say, CNN? Do they include user comments, CNN replies? Do they include all original postings or just final corrected versions? Their content is behind a paywall, isn't it?
    It goes without saying that what they have has the same copyright restrictions as the original, possibly extended by the addition of access aids, such as keywords. DCDuring TALK 19:25, 8 February 2012 (UTC)
    • Really I want to remove that "durable" part. Nothing in the world is durable, apart from stone tablets. -- Liliana 05:46, 8 February 2012 (UTC)
      Durable doesn't mean infinitely durable. We can be reasonably sure that print works on paper won't survive more than a few hundred years. I actually feel a little better knowing that print works are also archived digitally. Print works that exist only on high-acid paper, introduced about 150 years ago, are unlikely to last in that form for three hundred years from the printing. Apparently the problem is particularly serious for works printed in Russia and eastern Europe.
      If there were several multiple paper copies of the Usenet archives using acid-free paper, I would feel better than depending solely on the multiple electronic copies that I am told exist. Perhaps the site of the Norwegian seed and DNA repository could be used as one site for such storage. Perhaps copies of annual editions of the WMF projects could also be so archived. Perhaps some funding could be found for such a noble purpose. DCDuring TALK 08:57, 8 February 2012 (UTC)
I'll maintain my usual line that "durably archived" is bollocks and needs to go completely. Nobody can know which resources will last and which will not. It would violate WP:CRYSTAL (Wikipedia is not a crystal ball) but doesn't since we're not Wikipedia. But anyway, I would very happily dump it completely. Our current solution is just to totally ignore the meaning of "durably archived" and interpret it as meaning "published works and Usenet", which isn't a meaning but rather a description. Mglovesfun (talk) 16:52, 8 February 2012 (UTC)
I agree it's not really very clear, and I think your definition is actually clearer. I would support modifying CFI so that it defines appropriate sources as such, instead of calling them just 'durably archived'. —CodeCat 18:13, 8 February 2012 (UTC)
Er, a lot of online works are published in some sense. "Printed works and Usenet", perhaps.--Prosfilaes 20:41, 8 February 2012 (UTC)

I'm starting to agree with most of the other folks in this thread above: "durable" is kinda silly wording and should just be trimmed out. Newsbank, Lexis Nexis, and Westlaw are all perfectly fine as sources, and are archived, on microfiche, and digitally, and have survived for a long time and will continue to be archived successfully and available very easily to any researcher, and should be weighted equally to online sources. -- Cirt (talk) 18:52, 8 February 2012 (UTC)

The problem with the word durable might be that it could be interpreted as permanent, as the synonyms section of its entry suggests. But if the word has a comparative and a superlative (as its entry suggests), then it can't be equated with the word permanent so easily... AFAICT. And if durable then ain't synonymous with permanent, I'd say it could serve the purpose for CFI. At least if it is reworded to "archived in an extensively durable manner such as Usenet..." or s.t. --BiblbroX дискашн 20:33, 8 February 2012 (UTC)
Actually, if the word permanent has its comparative and superlative then maybe I am completely wrong about its meaning. --BiblbroX дискашн 20:35, 8 February 2012 (UTC)
All (almost all ?) adjectives that have an absolute sense (not gradable or comparable) also are used otherwise. See unique, for example. I find it hard to take absolute meanings seriously except in mathematics. Astronomy, geology, and history all favor non-absolute meanings, IMO. The field of archiving and storage is the realm of man-made artifacts, which seems a particularly poor realm for absolute meanings. DCDuring TALK 22:12, 8 February 2012 (UTC)
If they're printed on microfiche, then they are printed sources and already clearly usable under CFI. Moreover, my problem is with "CNN.com, The Huffington Post, Gamespot, etc.", and the theory that any and all text on those sites (include etc.) can be trusted to be durable.--Prosfilaes 20:41, 8 February 2012 (UTC)

CNN.com gets archived to news archive sources like Newsbank, Lexis Nexis, Westlaw. Those news archives are stored on microfiche. Therefore, CNN.com is durable. -- Cirt (talk) 23:50, 8 February 2012 (UTC)

I don't see how they get the videos, and they certainly don't archive the comments, and I would be surprised if now and forever there was no corporate blog or other informal stuff that didn't get so archived. But those are largely quibbles. If we want to make a list of those sites that are archived by such processes, that would be cool and useful. I see that as an affirmation of our current (somewhat de facto) policy, and not an encouraging of arbitrary websites.--Prosfilaes 01:49, 9 February 2012 (UTC)
Oh, agreed, of course. -- Cirt (talk) 03:48, 9 February 2012 (UTC)

Don't forget that durability is mentioned in CFI for verifiability purposes. Stating that a word does not exist or is not worth an entry only because citations are from media not considered durable enough would be absurd. And, again, Internet pages can be durably archived by our software when needed. Lmaltier 21:47, 10 February 2012 (UTC)

Surely not unless we get permission from the copyright holder. Equinox 21:50, 10 February 2012 (UTC)
Well, there's also Internet Archive. -- Cirt (talk) 23:57, 13 February 2012 (UTC)
We are in no way capable of tracking any significant segment of English outside what is durably recorded. Such is better left to dedicated dictionaries with dedicated scholars authoring them. The value over the long term of non-durably recorded terms is zero, as nobody will be looking them up.--Prosfilaes 22:48, 10 February 2012 (UTC)

This is currently the category for such terms as lion, tiger and jaguar. The idea is presumably that a "panther" is any species of Panthera, but I have never in my life heard panther used this way, it's not in the OED, and even if some citations can be found to support it, it's a very misleading name for a category of this sort. If we want to be that specific we should just go ahead and call it Category:en:Species of Panthera, otherwise why not just use Category:en:Big cats like everyone in the real world actually does. Ƿidsiþ 07:21, 8 February 2012 (UTC)

Big cats doesn't have an exact definition (for example, pumas and cheetahs are sometimes considered and sometimes not considered big cats). I don't think using Species of Panthera is ideal either; words like Latin pantherinus, which is related to panthers, but not a species, would be excluded. If we are to change, I suggest just Category:en:Pantherinae (includes Panthera, snow-leopard and clouded leopard), but I don't think it's ideal either. Ungoliant MMDCCLXIV 13:36, 8 February 2012 (UTC)
Categories are used to make searches easier, they should be designed for readers. Therefore: 1. They don't have to have a precise scientific definition. 2. Their names should be clear (Pantherinae is OK in Wikispecies, but not in a language dictionary; furthermore, the precise scientific classification changes over time, sometimes often, e.g. for fish, and these changes are irrelevant here).
I think that Big cats is an ideal name for this category. Lmaltier 21:37, 10 February 2012 (UTC)
Big cats works for me, though that term has a fuzzy boundary. ~ Robin 12:11, 16 February 2012 (UTC)
The simplest fix I can think of is delete Category:en:Panthers and have all the terms in Category:en:Felids or Category:en:Felines. There are not all that many of these terms. --Dan Polansky 12:23, 16 February 2012 (UTC)

Numbers or Numerals

I am kind of confused. We have both [[Category:Latin numerals]] and [[Category:Latin numbers]]: two categories that look the very same. --KoreanQuoter 15:27, 8 February 2012 (UTC)

It's a long dispute that has, to my knowledge, never been resolved. You can find a lot of it by searching in the archives. -- Liliana 15:51, 8 February 2012 (UTC)
What Liliana-60 said (to put it mildly). Mglovesfun (talk) 16:47, 8 February 2012 (UTC)

Definitions as sentences

I am once again reminded of Wiktionary:Votes/pl-2009-03/ELE Amendment 1. On the French Wiktionary we treat all languages the same, and all definitions are treated as sentences, even when it's a one word translation. On User talk:Mglovesfun/Archives/1#formatting, User:Widsith said "Just a small point, but glosses from foreign languages into English shouldn't end in full stops. Just the translation(s) alone is fine. Thanks!" This is absolutely the most common practice, but WT:ELE actually says "Each definition may be treated as a sentence: beginning with a capital letter and ending with a full stop." The formatting for non-English languages is pretty consistent; for English it's anything but. Some start with capital letters, some don't. Some finish with fullstops (i.e. periods), some don't. Any chance of implementing Visviva's suggestion in Wiktionary:Votes/pl-2009-03/ELE Amendment 1 from 2009? That is, treating all definitions as sentences. If nothing else, it would enforce consistency. Mglovesfun (talk) 12:15, 9 February 2012 (UTC)

Are you saying that the definition of "xyz" should be "An xyz is a whatever." or that it should be "A whatever." ? SemperBlotto 12:20, 9 February 2012 (UTC)
Sentence format, sorry: initial capital letter, final fullstop, even when it's a single word. So Spanish fuego is defined as "Fire." Mglovesfun (talk) 12:24, 9 February 2012 (UTC)
OK. If it ever comes to a vote - I'm in favour of free format (whatever the original editor thinks is best at the time). SemperBlotto 12:26, 9 February 2012 (UTC)
That's the status quo, AFAICT. Mglovesfun (talk) 12:30, 9 February 2012 (UTC)
When it comes to definitions, I imagine two different kinds:
  1. Simple "equational" definitions, where you get "definiendum = definiens", which are the norm for foreign-language definitions which give one-word translations or a list of largely synonymous one-word translations punctuated by commata. When I use this form of definition for English terms, I follow the OED in using the 〈=〉 symbol, as in the two senses of inverted hat.
  2. "Full-sentence" definitions, where there is an implied form of "definiendum [means / is / &c.] definiens", which are more-or-less the norm for English definitions which give descriptive glosses (that are usually semantically substitutable for the definiendum) or a number of equivalent descriptive glosses punctuated by semi-cola, and sometimes ending in one or more one-word synonyms (which are doing essentially the same thing as one-word–translation foreign-language definitions). Despite being "full-sentence" definitions, these can be very short, as in the case of senses 1 and 3 of inverted circumflex.
If any form of practice were to be formalised, I'd hope it would be the practice I describe above. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:06, 9 February 2012 (UTC)
My practice is fairly similar to yours, but I mostly only use the equals-sign notation for foreign terms, when I'm defining one as basically, "equal to such-and-such other foreign term". (For example, I defined קמ״ש (K.M.Sh., "kph") as
 # ={{term||[[קילומטר|קִילוֹמֶטֶר\־רִים]] [[ל־|לְ־]]\[[ב־|בְּ]][[שעה|שָׁעָה]]|kilometer(s) per hour|lang=he|tr=kilométer(im) l'-/b'sha'á}}: [[kph]] 
  1. =קִילוֹמֶטֶר\־רִים לְ־\בְּשָׁעָה (kilométer(im) l'-/b'sha'á, "kilometer(s) per hour"): kph
.) And EncycloPetey has objected to my doing even that.
RuakhTALK 16:55, 9 February 2012 (UTC)
That's interesting. Without knowing anything about Hebrew, I'd tend not to support that practice. My reasoning is this: English entries and non-English entries have, AFAICT, slightly different purposes. English entries are meant to explain what a word means; in the case of true synonyms, it is therefore appropriate to define one as "= [the other word]" to save unnecessary duplication. In the case of non-English entries, they're meant to give translations; accordingly, any non-English lemma ought to link directly to an English translation, saving any equivalent terms for a Synonyms section. That's my rationale, anyhow. I admit, however, that I mostly work with English terms, and have not thought through all the implications of my stance; in no way do I mean to be dogmatic. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 02:27, 11 February 2012 (UTC)
No, the purpose is exactly the same: describing a word, including its sense(s). The difference is that, for non-English words, it may be easier to provide a definition, because a translation may be sufficient to explain the meaning of the word. But this translation is a definition. Lmaltier 08:55, 11 February 2012 (UTC)
Yes, upon further (less fatigued) reflexion, you're right. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 08:16, 12 February 2012 (UTC)
@Doremítzwr: But it's not an "equivalent term", it's not a "synonym": it's the same term. It's the pronunciation, it's the etymology, it's everything. קמ״ש simply is קִילוֹמֶטֶרִים בְּשָׁעָה. —RuakhTALK 14:21, 11 February 2012 (UTC)
OK, then; shouldn't they be listed in Alternative forms sections, rather than in Synonyms sections? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 08:16, 12 February 2012 (UTC)
I'm not the one who suggested it should be in a Synonyms section. ;-)   But anyway, no: someone looking up קמ״ש will want to see קִילוֹמֶטֶרִים בְּשָׁעָה. If I had to remove one part of the definition or the other, I'd rather remove the "kph" part, because it's easier to figure out "kph" from קִילוֹמֶטֶרִים בְּשָׁעָה than the reverse. —RuakhTALK 14:52, 12 February 2012 (UTC)
Forgive my fuzzy thinking. I'm with you on this one. If קִילוֹמֶטֶרִים בְּשָׁעָה had an entry, I wouldn't support that, but as it doesn't, I think it's a good way to do things. Alternatives could include having that information in an Etymology section or changing the definition to "initialism of קִילוֹמֶטֶר\־רִים לְ־\בְּשָׁעָה (kilométer(im) l'-/b'sha'á, "kilometer(s) per hour"): kph", but I shan't pettifog. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 18:27, 12 February 2012 (UTC)

I don't agree with the use of initial capital letters and full stops in definitions. In most cases definitions do not have a main clause verb, thus they cannot be treated as ordinary sentences. This is more clear in foreign languages entries, where the "definition" is very frequently a single word or a set of words separated by commas. Moreover, I personally find it ugly and annoying, I mean being obligated to use something like [[word|Word]] instead of a plain [[word]]. What if there are two entries, one with a capital initial and one with a lower-case? The reader wouldn't know which one is the correct translation until they click on the link. I don't like the equation symbols either. I could accept them in a glossary, where the gloss comes right after the headword, but here it seems to me ugly and unjustified. --flyax 21:12, 9 February 2012 (UTC)

I've always found current practice very inconsistent. All dictionaries have a consistent presentation for definitions, capitalized or not, with a full stop or not, but they are consistent in the whole dictionary. Don't forget that, even for non-English words, what is provided is a definition, even when this definition is a single word (a definition is an explanation of what the word means, e.g. psychoanalyst is a good and sufficient definition for psychanalyste).
fr.wikt uses capitalized definitions with full stops, for all words (except where the convention is not applied). On the other hand, nl.wikt does not use full stops, nor capitals. This second option has two advantages:
  • in some cases, the absence of a capital makes the definition clearer, less ambiguous, as mentioned above.
  • the absence of a full stop discourages the addition of encyclopedic details.
A change is really needed, for consistency, and I would favor this second option. Lmaltier 21:39, 9 February 2012 (UTC)
I also strongly favor the second option (no capital and no full stops).Matthias Buchmeier 10:59, 10 February 2012 (UTC)
Me too. --JorisvS 11:07, 10 February 2012 (UTC)
Not ever? Mglovesfun (talk) 11:42, 10 February 2012 (UTC)
In long definitions a punctuation mark somewhere in the middle might be necessary. In these cases we could agree to always use the semi-colon. --flyax 12:02, 10 February 2012 (UTC)
Maybe only allow capital letters and fullstops for multi-sentence definitions. And in partial reply to DCDuring below, not all multi-sentence definitions will be bad ones. Mglovesfun (talk) 18:51, 10 February 2012 (UTC)
Definitions may be very long (e.g. for mathematical terms), but I don't think that multi-sentence definitions are needed. I can't find any example. This is a strong clue that unneeded encyclopedic details have been included. Lmaltier 21:28, 10 February 2012 (UTC)
Some definitions in English sections are in the form of clauses with a main verb. Some examples can be found among senses using {{non-gloss definition}}, especially those beginning with "Used". These can be viewed as sentences for which the headword is the subject of the sentence. There are also others with a clause as the main element of structure. Some definitions have other punctuation, such as semi-colons and commas separating main parts.
I don't think that such definitions are as intelligible without initial caps and final period. (I have no more evidence for my opinion than has been advanced for other claims about appearance and intelligibility in this discussion.)
Uniformity of appearance among definitions has been acknowledged by several in this discussion as a desideratum.
The consequence of accepting these propositions is that, if there is to be a single standard appearance for English, it must have initial caps and final period.
It might be nice to enforce a rule of only-one-period-per definition, which might be highly effective for identifying potentially encyclopedic entries, at least until semicolons replace periods among those trying to conceal their encyclopedic works. DCDuring TALK 11:59, 10 February 2012 (UTC)
I agree with the only-one-period-per-definition rule (if there is a period). About non-gloss definitions: yes, they are very rare in paper dictionaries, but they are very common here, as they are used for inflected forms. But, as we want to use a different format for them anyway, there may be an exception for them. Lmaltier 09:07, 11 February 2012 (UTC)
I don't mind either about an exception for non-gloss-definitions. However I don't think all of these are ordinary sentences. Statements beginning with a "used to .." are participle clauses the way I see it and inflected form definitions have no verb at all. These definitions should begin with a lower-case letter as well. --flyax 12:33, 11 February 2012 (UTC)

Here are two useful links: (a) Terminology, p.31-35 (b) ISO/IEC Directives Supplement, p 35. I am not implying that we are obligated to follow these instructions just because they've become an ISO standard, I am just giving them for further reading. --flyax 09:33, 11 February 2012 (UTC)

I understand that ISO wants to standardize the use of words, with precise meanings, in their documents, and they are right. We describe the languages as they are used, this is a very different objective. Anyway, I don't see how these documents relate to this discussion. Lmaltier 10:46, 11 February 2012 (UTC)
My intention was to draw attention to the way ISO wants to format definitions. (a) See in page 31: Definitions shall not: be given in full-sentence form ...; in p. 35: Definitions shall be lower case, including the first letter, except for any upper-case letters required by the normal spelling of a word in running text. (b) See in I.2.2.4.6: ... letters normally appearing in lower case shall remain in lower case (this applies in particular to the first letter of the definition). The definition shall not end with a full stop .... --flyax 12:01, 11 February 2012 (UTC)
I now understand, but they don't want to standardize dictionaries, they want to standardize definitions in their own documents. They are right, it's important. But different dictionaries make different decisions. The decision should be based on arguments. Lmaltier 13:24, 11 February 2012 (UTC)
We all think the same way I think. Reason, arguments, dialectic, personal preferences, stuff to study, all these are necessary. --flyax 14:17, 11 February 2012 (UTC)

I completely support the status quo, that is, I support having full sentences for English definitions and glosses for FL-to-English definitions. The needs of a single-language dictionary are very different from those of a translating dictionary and it doesn't seem strange or inconsistent to me to have a different style for the two cases. Ƿidsiþ 10:01, 15 February 2012 (UTC)

Translations are provided in the Translation section, and definitions in definition lines (# lines). Definitions make senses clear, and translations provide words of the same sense in other languages. I don't see any reason not to apply these principles systematically (keeping in mind that, for foreign words, a translation may be a good, sufficient, definition, but not always). Simple principles make everything simpler. Lmaltier 18:43, 15 February 2012 (UTC)

Indicating nasalisation in Proto-Germanic entry names

There is a discussion on this right now but I think it needs a bit more input. Please look and contribute if you can: Wiktionary talk:About Proto-Germanic#Indicating nasalisation in entry names. CodeCat 13:27, 10 February 2012 (UTC)

Diitidaht (Nitinaht - Southern Nootkan)

How can I become a contributor? I would like to enter my Diitidaht dictionary (I have thousands of words), and the language has fewer than 10 (5) speakers. I also speak Romany (Kalderash Gypsy), Danish and English; some Lushootseed (Straits Salish), some Nootkan, and some Makah (also southern Nootkan). —This comment was unsigned. User:Pakkichipps 02:21, 11 February 2012 (UTC)

Welcome. Read the following pages carefully and you'll be fine. Help:How to edit a page, Wiktionary:Tutorial, Wiktionary:What Wiktionary is not, WT:ELE, WT:CFI. Also, remember to sign your edits in discussion pages (just type ~~~~ and it will be converted into a signature). Ungoliant MMDCCLXIV 02:46, 11 February 2012 (UTC)
You will need the language code for Diitidaht, which is dtd. —Stephen (Talk) 06:30, 11 February 2012 (UTC)
... which doesn't exist? —CodeCat 12:17, 11 February 2012 (UTC)
It now does. -- Liliana 13:15, 11 February 2012 (UTC)
It would be a good start to have some agreement about the English name for this language. Neither Wikipedia at w:Ditidaht language nor SIL International use the double "i" in the name. Eclecticology 07:00, 12 February 2012 (UTC)

User modified this to change from Lower Silesian to Silesian German and added two interwikis. Are we happy about this? Mglovesfun (talk) 13:25, 12 February 2012 (UTC)

Not happy. Revert. -- Liliana 13:36, 12 February 2012 (UTC)
Ethnologue [7] calls it Upper Silesian, and mentions it's "Different from Lower Silesian, a dialect of Polish". WP redirects w:Lower Silesian language to w:Silesian German. I don't see any reason to be unhappy about it. Ungoliant MMDCCLXIV 13:38, 12 February 2012 (UTC)
WP also redirects w:Upper Silesian language to the Slavic w:Silesian language, so Ethnologue and WP don't seem to agree on which language is Upper Silesian and which language is Lower Silesian. —Angr 14:04, 13 February 2012 (UTC)
So calling it "Silesian German" is justified, as it avoids confusion. Ungoliant MMDCCLXIV 14:32, 13 February 2012 (UTC)
I liked the pair Upper Silesian vs. Lower Silesian better. -- Liliana 00:33, 14 February 2012 (UTC)
If only it were that simple. But both languages were spoken in Upper Silesia at some point, and since the annexation of Silesia to Poland after WWII there's now a Polish dialect in Lower Silesia as well. It's probably best if we use less ambiguous terms for both {{sli}} and {{szl}}. —Angr 11:49, 14 February 2012 (UTC)
Which ones specifically? I'm open to suggestions. -- Liliana 23:01, 18 February 2012 (UTC)

MediaWiki 1.19

(Apologies if this message isn't in your language.) The Wikimedia Foundation is planning to upgrade MediaWiki (the software powering this wiki) to its latest version this month. You can help to test it before it is enabled, to avoid disruption and breakage. More information is available in the full announcement. Thank you for your understanding.

Guillaume Paumier, via the Global message delivery system (wrong page? You can fix it.). 14:57, 12 February 2012 (UTC)

-ty and -ity in European languages

In an annoyingly nonstandard manner, this suffix is represented with or without the "i". Which is etymologically more correct?

Here's the part that needs cleanup once we decide which is to be the form-of and which the real entry:

- Metaknowledge 05:54, 13 February 2012 (UTC)

The OED [2ⁿᵈ ed., 1989] has entries for both "-ty, suffix¹" and "-ity", which I shall quote in full:
  1. "-ty, suffix¹": "denoting quality or condition, representing ME. -tie, -tee, -te (early ME. -teð), from OF. -te (mod.F. -té), earlier -tet (-ted): — L. -itātem, nom. -itās. Such Latin types as bonitātem, feritātem, were in OF. normally reduced to two syllables (bontet, fertet) by elision of the -i- between the two stresses, so that -tet, later -te, became the regular form of the suffix. The final dental still appears in some early adoptions in ME., as plenteð, plenteth plenty (c 1250, in use till c 1600), and is characteristic of the Scottish forms bountith, daintith, and poortith (q.v.). The reduced form -te, however, is found in words recorded from shortly before or after 1200, such as bonte bounty, cruelte cruelty, debonerte debonairness, deinte dainty (n.), plente plenty, poverte poverty, purte purity, and vilte vileness. Among others which appear somewhat later are certeynte certainty, Cristente Christenty, freelte frailty, novelte novelty, and sotelte subtlety. Varying forms of the stem are found in the words now or formerly represented by beauty, fealty, lealty, †lewty, loyalty, †realty, †rialty, and royalty. From the types lealte, realte, the ending -alte (mod.F. -auté) was in OF. extended to formations from different stems, and many words of this form (ultimately written with -alty) established themselves in English, as admiralty, casualty, commonalty, †generalty, mayoralty, †principalty, †regalty, severalty, specialty, spiritualty, temporalty. Most of these date from the 14th or early 15th century; penalty appears to be of later introduction (1512). An obsolete type of formation is exhibited by curiouste, hid(e)ouste, and joyouste. In OF. certain analogies led to the frequent substitution of -ete for -te, but this form of the suffix is only occasionally adopted in English, as in the obsolete noblete, purete, and simplete; the early sauvete is now represented by safety. Under Latin influence many words in OF. also appear with -ite (mod.F. 
-ité) in place of -(e)te; hence English forms in -ity, which in many cases (as in F.) have supplanted those in -ty. [¶] Although occurring in a large number of words the suffix has shown little productive power in English; evelte, everlastingte, and overte occur in the 14–15th cent., and shrievalty, sheriffalty, have had currency from the beginning of the 16th cent., but such formations are very rare. [¶] Such words as faculty, difficulty, honesty, modesty, puberty, represent Latin formations in which the suffix -tās is directly added to a consonantal stem. The number of these in English, as in French, is very small. [¶] The early form of the suffix (-te, or -tee) remained in use down to the 16th cent., but from the 15th was gradually supplanted by -tie, -tye, and the surviving -ty."
  2. "-ity": "[ME. -ite, a. F. -ité, L. -itāt-em] [¶] the usual form in which the suffix (L. -tās, -tātem, expressing state or condition) appears, the i- being orig. either the stem vowel of the radical (e.g. L. suāvi-tās suavity), or its weakened repr. (e.g. L. puro-, pūri-tās purity), rarely a mere connective (e.g. L. auctōr-i-tās authority; so ME. emperorite, in Vernon MS., St. Ambrose 886). The last became more frequent in med. and mod.L., and the mod. langs., in abstracts from comparatives, as majority, minority, superiority, inferiority, interiority. Hence such formations as egoity, with playful or pedantic nonce-words of Eng. formation, as between-ity, coxcomb-ity, cuppe-ity, table-ity, threadbar-ity, woman-ity (after humani-ty), youthfull-ity. [¶] After i, -ity becomes -ety, as in pie-ty, varie-ty (L. pietātem, varie-tātem). The termination was in L. often added to another adj. suffix, e.g. -āci-, -āli-, -āno-, -āri-, -ārio-, -bili-, -eo-, -idi-, -ido-, -ili-, -īli-, -ino-, -īno-, -io-, -īvo-, -ōci-, -ōso-, -ui-, -uo-, etc., whence the Eng. endings -acity, -ality, -anity, -arity, -ariety, -bility, -eity, -idity, -ility, -inity, -iety, -ivity, -ocity, -osity, -uity, some of which, as -bility (-ability, -ibility) attain almost to the rank of independent suffixes. The earlier popular Fr. form was -eté, in Eng. -ety and -ty, as in safety, bounty, plenty: see -ty."
They seem to treat -ity as merely a concatenation of -i- + -ty, albeit a concatenation far more common than -ty without -i- before it. Might it be worth doing as they seem to do, lemmatising the -ty forms and including redirects defined as "-i- + -ty" (or similar) thereto, with usage notes explaining the relation at the lemma? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:35, 13 February 2012 (UTC)
Just to note we allow the acute accent in Old French to represent /e/ at the end of a word, so our Old French entry is bonté not bonte. I seem to think the reasons for this are at Wiktionary talk:About Old French. I don't want to say anymore because I don't want to unwillingly hijack this thread. Mglovesfun (talk) 11:58, 13 February 2012 (UTC)

Mandarin pinyin with numbers

On User talk:Atitarev#Mandarin with numbers I brought up the issue of whether to keep Mandarin pinyin with numbers as opposed to diacritics. Wiktionary:Votes/2011-07/Pinyin entries says "That a pinyin entry, using the tone-marking diacritics, be allowed whenever we have an entry for a traditional-characters or simplified-characters spelling." No mention of numbers, so they're not protected by the vote. But {{cmn-alt-pinyin}} requires both forms and some of these numbered entries go back years, at least as far back as 2006, so I don't think we should start deleting them outright with no prior discussion. While Wiktionary:Votes/2011-07/Pinyin entries doesn't protect these entries, it doesn't mention them in any way, so it's not the case that the vote is pronouncing these invalid. Mglovesfun (talk) 12:01, 13 February 2012 (UTC)

Just in case it's not clear, no objection from me to delete all these. I only oppose deleting these with no prior discussion. This is that discussion. Mglovesfun (talk) 12:08, 13 February 2012 (UTC)
I have checked a few pages using {{cmn-alt-pinyin}} and saw only one syllable pinyin with numbers - mai4, kan4. After a second thought, perhaps it's OK to keep one syllable entries with tone numbers (if there are serious objections) but not entries like "dong4wu4". Books which do use tone numbers (increasingly rare) have spaces between syllables, e.g. "dong4 wu4", anyway. --Anatoli (обсудить) 12:16, 13 February 2012 (UTC)
For reference, this discussion concerns entries in Category:Mandarin pinyin with tone numbers, which has 1,473 entries. It seems that a great many or all of the entries were created by BD2412 (talk • contribs) in 2006. --Dan Polansky 13:07, 13 February 2012 (UTC)
There's no need for discussion, because Wiktionary:About Sinitic languages#Mandarin addresses this explicitly:
For individual syllables, we have entries in each of these systems, as well as in pinyin with no tones marked at all. For words with multiple syllables, we only have entries for the pinyin romanizations, with tones marked using diacritics.
(citations omitted). If you're aware of any multi-syllable pinyin-with-numerals entries, please list them at RFD so they can be dealt with properly (e.g., moved to the pinyin-with-diacritics title). But single-syllable pinyin-with-numerals entries are absolutely 100% vote-approved, and must be kept.
RuakhTALK 13:24, 13 February 2012 (UTC)
Mglovesfun has restored the monosyllabic entries I deleted (thanks). The polysyllabic ones usually duplicate the existing toned pinyin entries, which we are reformatting according to the vote, so there's no need to rename or fix them, sorry, they just go straight to the bin. If it's not the case, they are renamed and reformatted. We don't support Wade-Giles, Tongyong Pinyin, Yale, Zhuyin Fuhao (Bopomofo) and any other romanisation/transliteration of Mandarin apart from Hanyu Pinyin with tone marks. The language-specific policy ( Wiktionary:About Sinitic languages#Mandarin) is created and maintained by Mandarin speaking editors and there is no need to keep entries, which are not in the proper script and unattestable. Perhaps, the policy on monosyllabic entries should be reviewed but other Sinitic editors should be involved in the discussion. In my opinion, those entries could be converted to soft or hard redirects to toned pinyin entries with all the information. --Anatoli (обсудить) 22:48, 13 February 2012 (UTC)
It's at least worth discussing. It does seem to me even if the versions of the polysyllabic words with numbers shouldn't be speedily deleted, the vote offers no protection for them, so they would have to meet CFI by being attested and idiomatic. So anything that doesn't get any Google Books, Groups or Scholar hits should go. Mglovesfun (talk) 11:01, 14 February 2012 (UTC)
The polysyllabic ones definitely have to go. As for the monosyllabic ones, I am inclined towards deleting them. There is another solution. Either redirect the entire page or if we are not comfortable with this, then make it an alternative form of its diacritic counterpart. I really don't see the point of duplicating the effort. Unlike the tug-of-war between whether to prefer simplified script over traditional (or vice versa), this one is quite clearcut as to which one we prefer, so alt form makes sense in this case. JamesjiaoTC 01:41, 17 February 2012 (UTC)
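For anyone weighing the redirect or alternative-form options above, the mapping from a tone-numbered title to its diacritic counterpart can be computed mechanically. The following is only a rough sketch, not an existing Wiktionary tool: the function names are hypothetical, it handles lowercase input only, and it assumes the usual placement convention (mark "a" or "e" if present, else the "o" of "ou", else the last vowel), with "v" accepted as a stand-in for "ü".

```python
import re
import unicodedata

# Combining marks for tones 1-4 (macron, acute, caron, grave).
COMBINING = {1: "\u0304", 2: "\u0301", 3: "\u030c", 4: "\u0300"}
VOWELS = "aeiou\u00fc"  # \u00fc is u with diaeresis


def mark_syllable(syllable: str, tone: int) -> str:
    """Place the diacritic for `tone` on a single lowercase pinyin syllable."""
    syllable = syllable.replace("v", "\u00fc")  # keyboard stand-in for u-diaeresis
    if tone not in COMBINING:
        return syllable  # tone 5 (neutral): no mark
    if "a" in syllable:
        target = syllable.index("a")
    elif "e" in syllable:
        target = syllable.index("e")
    elif "ou" in syllable:
        target = syllable.index("o")
    else:
        positions = [i for i, ch in enumerate(syllable) if ch in VOWELS]
        if not positions:
            return syllable  # nothing to mark
        target = positions[-1]
    # Compose base vowel + combining mark into one precomposed character.
    marked = unicodedata.normalize("NFC", syllable[target] + COMBINING[tone])
    return syllable[:target] + marked + syllable[target + 1:]


def numbered_to_diacritic(text: str) -> str:
    """Convert every 'letters + tone digit' run in `text` to diacritic pinyin."""
    return re.sub(
        r"([a-z\u00fc]+)([1-5])",
        lambda m: mark_syllable(m.group(1), int(m.group(2))),
        text,
    )


for title in ("mai4", "kan4", "dong4wu4", "dong4 wu4"):
    print(title, "->", numbered_to_diacritic(title))
```

A bot using something like this could generate the redirect target for each monosyllabic entry, and flag polysyllabic titles whose converted form already exists.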

Using modifier letters for superscript

A bunch of modifier letters that look like superscript letters were encoded into Unicode for use in various languages and particularly phonetic systems. They were not meant for "generic styling mechanisms for superscripting of text, as for footnotes, mathematical and chemical expressions, and the like." (See http://www.unicode.org/versions/Unicode6.0.0/ch07.pdf ) User:Doremítzwr insists on using them for ordinals, like [8], and for general superscripting like majᵗʸ. Note there's no way to automatically uppercase that text, there's no way to automatically search for it unless you know the idiosyncratic means of encoding it, and there's a limited set of characters; I'm not sure if Basic Latin is now covered, but I know most Latin characters outside the basic 26 of English aren't, and only a handful of Cyrillic or Greek. We should be using superscripts for the ordinals (if it's really thought necessary), and we can either spell it majty or maj'ty, or put it on a page of superscripted abbreviations. --Prosfilaes 12:10, 13 February 2012 (UTC)

Those partially superscript contractions are extremely numerous in older texts, and whilst some of them will occur in both forms (e.g., both majᵗʸ and majty occur), other contractions only occur partially superscript (such as principˡ). Majᵗʸ, majty, and maj'ty all occur; how would you present the first? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:11, 13 February 2012 (UTC)
No, we shouldn't be using any kind of superscripts for ordinals, whether "pre-composed" or created by means of html tags. It looks ridiculously old-fashioned. And for dates we shouldn't be using any kind of ordinals. We should be writing "February 10" and "August 14". —Angr 14:21, 13 February 2012 (UTC)
I must take exception to your edit comment "we don't live in the 19th century". In what world do you live? If superscript ordinals are a typographical feature restricted to the nineteenth century, why the hell would Microsoft Word — probably the most popular word processor in the world — autocorrect "1st", "2nd", "3rd", "4th", etc. to "1ˢᵗ", "2ⁿᵈ", "3ʳᵈ", "4ᵗʰ", etc. by default? And why shouldn't we be using ordinals for dates? With years, "February 10 2012" and "2011 August 14" look wrong. Indeed, most people use ordinals when writing dates. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 15:11, 13 February 2012 (UTC)
I can't see what that last link is to (Google doesn't let me), but in my limited experience most people do not use ordinals (written as such) when writing dates. They write "February 13, 2012" (as the case may be). Is this perhaps a pondian difference?​—msh210 (talk) 19:47, 13 February 2012 (UTC)
Here's the relevant bit, in our citation format:
  • 2012 February, Andrea Jones, All about Level 3 ITQ QCF: Using Microsoft Word 2010 (All About Resources, ISBN 9781908750013), page 23
    Ordinals (1st) with superscript [¶] Most people probably do find this feature useful as they may use ordinals when typing dates (like 1ˢᵗ January 2012).
The author's from Lydbury North and the book was printed in the UK, so that much, at least, is consistent with your hypothesis that the use of ordinal suffixes is a Cisatlantic thing. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 22:45, 13 February 2012 (UTC)
I live in the UK and read a great deal and the superscripts in dates look comically antiquated to me. Equinox 22:51, 13 February 2012 (UTC)
Then we disagree. Clearly, we need the input of style guides on this issue. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:15, 14 February 2012 (UTC)
Per two of Prosfilaes's points — one, that Unicode explicitly notes that these characters are not meant as superscripted standard letters for style purposes and, two, that they are hard to search for — I'll have to agree we should not use them for dates in citations or in page titles. (For page titles, we can use the unsuperscripted versions. The headword line can include the superscripted version (or both, as appropriate); or, if the superscripted version is vanishingly rare as compared to the other, then its existence can be relegated to a usage note.)​—msh210 (talk) 19:52, 13 February 2012 (UTC)
Isn't that a problem if we have entries for both majᵗʸ and majty? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 22:45, 13 February 2012 (UTC)
Should we? We don't include the, THE, The, Tʜᴇ, and ᴛʜᴇ: the differences are in style not the word proper.​—msh210 (talk) 23:56, 13 February 2012 (UTC)
I don't think the ᵗʸ in majᵗʸ is merely stylistic — it remains superscript often irrespective of context (such as if everything around it is in all caps). — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:15, 14 February 2012 (UTC)
I agree wholeheartedly with Prosfilaes and msh210 that these modifier letters should not be used to write superscripts, because they are not intended or suited for that purpose (they are apparently not found by searches for the non-superscript letters); only <sup> and such things should be used on regular characters when it is necessary to write something superscript. - -sche (discuss) 00:56, 14 February 2012 (UTC)
They are found in searches; for example, 1ˢᵗ is the second search result that appears when one searches for 1st. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 01:24, 14 February 2012 (UTC)
It is neat to learn that final letters were often superscripted, though — even superscripted in cases like Principl where almost no space is saved! I saw honour (superscript) in a recaptcha image (i.e. taken from some old book) just yesterday and was confused until now. I would never have searched Wiktionary for honouʳ (modifier), mind you... - -sche (discuss) 01:02, 14 February 2012 (UTC)
Oh, and display as superscript (in headwords, in citations, in {{term}}, even in pagetitles by means of DISPLAYTITLE) can be by means of the HTML sup element.​—msh210 (talk) 19:57, 13 February 2012 (UTC)
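The searchability and uppercasing points in this sub-thread can be checked directly. A minimal sketch, assuming a search index that folds text with Unicode compatibility normalization (NFKC); whether any particular search engine does so is an assumption here, and `fold_for_search` is a hypothetical helper name:

```python
import unicodedata


def fold_for_search(text: str) -> str:
    """Fold text the way a compatibility-normalizing search index might."""
    return unicodedata.normalize("NFKC", text).casefold()


abbreviation = "maj\u1d57\u02b8"   # maj + MODIFIER LETTER SMALL T + MODIFIER LETTER SMALL Y
ordinal = "1\u02e2\u1d57"          # 1 + MODIFIER LETTER SMALL S + MODIFIER LETTER SMALL T

# With NFKC folding, the modifier-letter spellings collapse onto the plain ones.
print(fold_for_search(abbreviation) == "majty")
print(fold_for_search(ordinal) == "1st")

# Without folding, a plain-text comparison does not match.
print(abbreviation == "majty")

# The modifier letters have no uppercase mappings, so only "maj" changes.
print(abbreviation.upper())
```

This suggests both sides may be describing real behaviour: a search that normalizes will find 1ˢᵗ for "1st", while one that compares code points will not, and automatic uppercasing leaves the modifier letters untouched either way.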

How many, and which entries use superscript characters? -- Liliana 00:29, 14 February 2012 (UTC)

There are potentially thousands of entries for obsolete spellings of this kind. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:51, 14 February 2012 (UTC)
I'm asking because in chemistry subscript letters are commonly used, like in H₂SO₄. -- Liliana 00:54, 14 February 2012 (UTC)
Well, those are subscript numerals, but they are another example of the legitimate (and irreplaceable) use of these characters. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 00:57, 14 February 2012 (UTC)
We have a not-yet-standardised mix of hard and soft redirects pointing to/from H2O, H2SO4 etc from/to the subscript versions so they can be found. Also, the subscript numbers were probably intended to be used in place of <sub>, unlike modifiers like ʳ, which were explicitly not intended to be used in place of <sup>. - -sche (discuss) 01:06, 14 February 2012 (UTC)

Question: how were things like "majty" and "4h" originally put onto paper? Did book presses and typewriters use dedicated distinct characters, or did they move regular characters around? Obviously, even if they used dedicated separate characters, those characters do not correspond to Unicode's modifier letters, and so we should not misrepresent them by Unicode's modifier letters, but if they just moved regular characters around, there really would seem to be no argument for using dedicated characters here. - -sche (discuss) 05:11, 14 February 2012 (UTC)

God knows. The superscripts are consistently smaller than the regular characters in whose context they appear. Maybe they just used type pieces for smaller font sizes, but I can't tell you with any authority. Whatever the case, physical type pieces and digital characters are disanalogous. With physical type pieces, one must use different bits of metal every time he wishes to change font sizes; the same digital characters are used irrespective of what font size is selected, and each is kept in the same relation of scale to every other. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 23:21, 14 February 2012 (UTC)
Well put, and this is exactly what I thought when I read -sche's comment. Sizes of traditional type don't have a bearing on digital characters. The things we are more interested in are stylised forms like & for et. Equinox 23:27, 14 February 2012 (UTC)
I should probably clarify: I am opposed to using modifier letters for things like majty; I consider the question of whether or not to use ordinals like 14th a separate question; I would prefer not to use ordinals, but I am not as opposed to ordinals as to modifiers. - -sche (discuss) 23:33, 15 February 2012 (UTC)

NB: we currently have some entries which are exclusively modifier-characters, like . - -sche (discuss) 05:11, 14 February 2012 (UTC)

It's clear that special characters should be used only for what they are designed for. Otherwise, it would be like using the Roman letter A in Bulgarian or Russian words because the appearance is exactly the same. Lmaltier 22:29, 15 February 2012 (UTC)
Good point! - -sche (discuss) 23:33, 15 February 2012 (UTC)
Not really. Obviously, it's better to use something tailor made if it's available (in the case of the Cyrillic А vs. the Roman A, it's better to use the former in words otherwise written in Cyrillic, because it causes the word in question to be sorted properly (i.e., alphabetically)), but in the case of these superscript forms, there is nothing tailor made that's available, so we have to make do with something that was designed for another purpose, but which nevertheless does the job just fine. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 04:14, 16 February 2012 (UTC)
Except that it's much more problematic with browsers, systems, and users than st, which is a real issue for the ordinals since there's no functional loss with using st. We should try for consistency, and none of our non-Doremítzwr users have any intention of using these characters in our dates. --Prosfilaes 10:43, 16 February 2012 (UTC)
There is something made to allow the representation of superscripts: the <sup> tags and other things msh210 describes. - -sche (discuss) 20:25, 16 February 2012 (UTC)
<sup> tags cannot be used in page titles. In the main text, <sup> tags cause line-spacing problems. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 03:37, 17 February 2012 (UTC)
They can't be used in the title as displayed in the browser's tab or what-have-you, but they can be used in the top-level header (even though we don't edit that one in the wiki source of the page). (I'm not sure which you meant.)​—msh210 (talk) 01:05, 21 February 2012 (UTC)
By "page title", I mean the text that appears atop a given page (before section zero and the table of contents), e.g., the "homoglyph" in large text atop our page for homoglyph. What do you mean? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:58, 21 February 2012 (UTC)
That thing _can_ have superscripts and subscripts.​—msh210 (talk) 18:48, 21 February 2012 (UTC)
Yes, fr:Mme suggests that. Could you show me how, using a page of your choosing as an example? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 19:00, 21 February 2012 (UTC)
See [[User:Msh210 on a public computer]].​—msh210 (talk) 19:19, 21 February 2012 (UTC)
Hmm. Why hasn't this worked? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 21:18, 21 February 2012 (UTC)
Because the thing in the {{DISPLAYTITLE}} and the actual title must be equivalent in the sense that the former (once internal HTML tags are removed) can be used in a URL (or [[link]]) to yield the latter. In the linked-to case, majty as a pagetitle is inequivalent (in that sense) to majᵗʸ.​—msh210 (talk) 23:59, 21 February 2012 (UTC)
OK, thanks; noted. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 13:54, 22 February 2012 (UTC)
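The equivalence rule msh210 describes can be illustrated with a small check. This is only a sketch of the rule as stated in this thread, not MediaWiki's actual implementation; the function name is hypothetical, and the only normalization assumed is tag stripping plus treating underscores as spaces.

```python
import re


def displaytitle_allowed(page_title: str, display_title: str) -> bool:
    """Return True if display_title, with HTML tags removed, names page_title."""
    stripped = re.sub(r"<[^>]+>", "", display_title)
    return stripped.replace("_", " ").strip() == page_title.replace("_", " ").strip()


# A page actually titled "majty" may be displayed with a superscript "ty".
print(displaytitle_allowed("majty", "maj<sup>ty</sup>"))

# A page titled with modifier letters may not: "majty" is a different title.
print(displaytitle_allowed("maj\u1d57\u02b8", "maj<sup>ty</sup>"))
```

Under this reading, the superscripted display only works when the underlying page title uses the ordinary letters.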

The superscripted abbreviations are left-over typographical conventions from the days before Gutenberg. Fortunately, they mostly died out by the end of the 17th century. Paper was expensive in those days, and these abbreviations allowed more text to be put on a page. Entire books have been devoted to the peculiarities of Latin and Greek pæleography. For dates in the ISO format I would use "2011-08-14" and not "2011 August 14" since these were intended to be computer sortable. Putting an ordinal into these looks bizarre. Eclecticology 10:14, 16 February 2012 (UTC)

The ISO format you advocate has problems of potential ambiguity; just as some people write "7ᵗʰ of August 2011" (7-8-2011) and others "August 7ᵗʰ 2011" (8-7-2011), so some people write "2011, August 7ᵗʰ" (2011-8-7) whilst others write "2011, 7ᵗʰ of August" (2011-7-8). We don't need our citations to be computer-sortable, because they're already listed from oldest to most recent, as standard. BTW, it's palæography. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:24, 16 February 2012 (UTC)
Typo gratefully acknowledged. -- Ec
Citation needed. As far as I know, every single person who uses 2011-8-7 format uses it in year-month-day format. That's part of why it was chosen as ISO standard format, because it didn't have a conflicting body of usage. Your format has problems, too, as some people will see it as 7?? of August 2011 or 7▉▉ of August 2011.--Prosfilaes 10:43, 16 February 2012 (UTC)
There are many available examples of YYYY DD MM date formatting: [9], [10], [11], [12], [13], [14], [15], [16]. Take especial note of this one which explains the rationale behind the YYYY DD MM order as:
  • 1999, Twin Plant News: TP. (Nibbe, Hernandez and Associates), volume 14, issues 7–12, page unknown
    YYYY-DD-MM or the year followed by day followed by month separated either by a dash or a slash. The logic for this standard is very simple…start with the largest number and then write the next largest number and so on. The year is the largest number after which a day which can be up to 31, after which the month which can be up to 12.
Encoding problems are never long-term problems, and in the meantime, boxes and such will not introduce ambiguity. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 03:37, 17 February 2012 (UTC)
You can develop a rationale for anything, including the one quoted, but that doesn't change the international standard. Eclecticology 08:51, 17 February 2012 (UTC)
No, certainly, but that wasn't my point. Prosfilaes didn't believe me that some people use YYYY DD MM date formatting, so I provided evidence that people do; the quoted rationale was just to show why some people would consider such a format to be intuitive. I agree that either YYYY MM DD or DD MM YYYY makes most sense, but that doesn't stop people misinterpreting the month number for the day number and vice versa when the date is anywhen between the 1ˢᵗ and the 12ᵗʰ of a given month (which is the case for approximately ⅖ of all dates). — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:37, 17 February 2012 (UTC)
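The "approximately ⅖" figure can be checked by counting. A short sketch, using 2011 as an arbitrary non-leap year; the function name is illustrative only. It counts dates whose day number is 12 or lower, and separately those where swapping day and month would actually yield a different date.

```python
from datetime import date, timedelta


def ambiguity_counts(year: int) -> tuple[int, int, int]:
    """Return (days in year, days numbered 1-12, days where day != month too)."""
    current = date(year, 1, 1)
    total = low_day = swappable = 0
    while current.year == year:
        total += 1
        if current.day <= 12:
            low_day += 1
            # Swapping day and month only changes the date if they differ.
            if current.day != current.month:
                swappable += 1
        current += timedelta(days=1)
    return total, low_day, swappable


total, low_day, swappable = ambiguity_counts(2011)
print(f"{low_day}/{total} = {low_day / total:.3f}")
print(f"{swappable}/{total} = {swappable / total:.3f}")
```

For 2011 this gives 144 of 365 days with a day number up to 12, which is close to ⅖; excluding dates such as 3 March, where day and month coincide, leaves 132 of 365.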

Note that, for page titles, this is the same kind of issue as italics (e.g. in animal scientific names). There is a solution used by fr.wikt (e.g. see fr:Mme: the title is Mme without using special letters). However, this solution cannot work if we want to create both Mme and Mme (superscript), or Canis and Canis (italic), in different pages. The solution is to consider that, for technical reasons, page titles don't take superscripts, italics, etc. into account, and that all such variations are addressed in the same page. This is a perfectly reasonable and sound solution, and it's easy to understand. Lmaltier 07:01, 17 February 2012 (UTC)

But why, when there's no need for us to be limited like that with our page titles? And by the same logic, why don't we strip all our page titles of diacritics and non-ASCII characters? That would make them a whole lot easier to search for using an ordinary keyboard. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 07:26, 17 February 2012 (UTC)
Hold on, since when did we use italics in page titles? It's possible (cf. canis and 𝑐𝑎𝑛𝑖𝑠), but why would you do this? -- Liliana 07:40, 17 February 2012 (UTC)
I think he's thinking of Wikipedia, where they italicise the page titles for species names and such. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 07:46, 17 February 2012 (UTC)
Not Wikipedia, but the international convention that a genus (or a species, or any taxon below the genus) must be written in italics.
About special letters: they must be used in titles if (and only if) they are used in the language, it's very simple. And these letters are not used in English. In majty, the t is a normal t, the y is a normal y, they just happen to be smaller and written higher. If we don't use the Roman letter A in Bulgarian words, it's not because of the alphabetical order, it's because it would be wrong: the Roman letter, the Cyrillic letter and the Greek letter are three different letters despite their common appearance. It's exactly the same here. Lmaltier 07:18, 18 February 2012 (UTC)
I just created an entry for the French contraction Mʳ ("monsieur"), which is unambiguously attested in Usenet sources as employing the MODIFIER LETTER SMALL R. Should French print sources published before June 1993 (the date of the introduction of U+02B3) count towards its antedating? Or further, should French print sources published prior to the invention of digital computers count towards its antedating? Examples of a contraction taking the form of a majuscule em followed by a superscript minuscule ar certainly exist in such print sources. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:03, 18 February 2012 (UTC)
The normal abbreviation is M. but, you are right, Mr is attested. However, the entry you created, Mʳ, is not attested, as the ʳ letter does not exist in French; it is never used in French.
And Unicode is very clear (see document mentioned above): these letters are modifier letters, and they cannot be used for normal subscripted letters. Lmaltier 11:34, 18 February 2012 (UTC)
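What the Unicode character database records about these characters can be inspected with Python's standard `unicodedata` module. This is only an illustrative sketch; the helper names `describe` and `fold_superscripts` are invented here. It shows that ʳ, ᵗ and ʸ are classed as modifier letters (general category `Lm`) and that compatibility normalization (NFKC) maps them to the ordinary letters.

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode name, general category) for one character."""
    return (f"U+{ord(ch):04X}", unicodedata.name(ch), unicodedata.category(ch))

def fold_superscripts(title):
    """Apply NFKC normalization, which replaces modifier letters such as
    U+02B3 with their plain counterparts (it also folds other compatibility
    characters, e.g. subscript digits)."""
    return unicodedata.normalize("NFKC", title)

for ch in "\u02b3\u1d57\u02b8":          # ʳ ᵗ ʸ
    print(describe(ch))

print(fold_superscripts("M\u02b3"))        # Mʳ
print(fold_superscripts("maj\u1d57\u02b8"))  # majᵗʸ
```

Under this folding "majᵗʸ" and "majty" become the same string, which is one way a search or redirect layer could treat them as a single title. Whether that is desirable is exactly the point under debate here.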
Did you even look at the entry? It has five citations (two are by the same guy, but that's still four independent ones), which disproves your assertion that "ʳ…is never used in French." — Raifʻhār Doremítzwr ~ (U · T · C) ~ 12:52, 18 February 2012 (UTC)
Lmaltier, despite your point being right, we aren't much better sometimes. Many of the minority languages of Russia use capital I instead of the palochka Ӏ, technically Unicode considers this practice illegal, and by your logic we should move all these entries to the spellings with palochka. -- Liliana 13:47, 18 February 2012 (UTC)
The [[Ӏ]] page states that the Roman I is in standard use (despite Unicode) in some languages for technical reasons (keyboards). In such a case, both pages are probably acceptable (I myself created pages for town names with a bad typography for the capital (E instead of capital é) because the bad typography is very common, probably more common than the right one). But, of course, it's not the case for modifier letters such as ʳ (using the right r is much easier). Lmaltier 17:57, 18 February 2012 (UTC)
If the town names you're talking about are French, you should note that French orthography traditionally omits diacritics from atop letters when they are capitalised (though such omission is non-standard in Québecois French).
I don't think ease of entry is a valid criterion here. The examples I cited are in a medium that does not permit superscribing by any other method than by the use of characters like 〈ʳ〉. Given a more flexible medium, such as Microsoft Word, most people will use such a program's superscript function (equivalent to using <sup> tags here); but we don't have that flexibility in our page titles and the use of <sup> generally is problematic, which makes our medium more similar to Usenet than to Word. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 21:11, 19 February 2012 (UTC)
I think there is no difference between countries. If you look at town halls in France, you'll read LIBERTÉ, ÉGALITÉ, FRATERNITÉ, and this has always been the normal typography. But this character É is absent from typewriter and computer keyboards. Lmaltier 21:22, 19 February 2012 (UTC)
I'd read that diacritics are omitted from atop majuscules because otherwise maximal letter height would be exceeded. Perhaps my source and I are wrong, however. Still, your explanation of such commonplace omission as being caused by the "character É [being] absent from typewriter and computer keyboards" is implausible, because 〈É〉's absence would also lead to the omission of the acute accent from the minuscule 〈é〉, which I assume does not occur with anywhere near the same frequency; furthermore, whereas 〈é〉 can be generated by a simple shortcut like Alt Gr + E, 〈É〉 can be generated by a comparably simple shortcut, namely Alt Gr + Shift + E. Unequal ease of entry using typewriters and/or computer keyboards seems not to explain this phenomenon. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:56, 20 February 2012 (UTC)
The letters é, è, à, ù, ç are present on all AZERTY keyboards (including mine), of course... You could not do without them. But not the capitalized versions. Lmaltier 22:04, 20 February 2012 (UTC)
Aah, how interesting! I was not aware of AZERTY keyboards. Yes, that would probably explain the frequency of omission. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:58, 21 February 2012 (UTC)
Of course it's true that "Examples of a contraction taking the form of a majuscule em followed by a superscript minuscule ar certainly exist in such print sources." That is, a superscript r, not a modifier letter r.--Prosfilaes 14:32, 18 February 2012 (UTC)
Of course, this is what I mean. I repeat that the modifier letter ʳ does not exist in French, it's never used in French. The character representing it might have been used in a few cases, and you found a few examples, but certainly not the modifier letter (most probably the authors don't know what "modifier letter" means, they used the character because it looked more or less right, although not quite). A few years ago, I created many Bulgarian first names by bot (on fr.wikt), and I used a Roman a instead of a Cyrillic a in a number of cases. The mistake has been fixed, but would you have used such mistakes as a rationale for creating here these first names with a Roman letter a? Lmaltier 17:49, 18 February 2012 (UTC)
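The kind of bot mistake described above (a Roman a inside an otherwise Cyrillic name) is invisible to the eye but easy to detect mechanically. Below is a rough sketch, assuming that the first word of a character's Unicode name is a good enough stand-in for its script; that is a heuristic, not an official script property, and the function names are hypothetical.

```python
import unicodedata

def scripts_in(word):
    """Collect the first word of the Unicode name of every letter in `word`
    (e.g. 'LATIN', 'CYRILLIC', 'GREEK'); a heuristic stand-in for script."""
    scripts = set()
    for ch in word:
        if ch.isalpha():
            scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_mixed_script(word):
    """True if the letters of `word` appear to come from more than one script."""
    return len(scripts_in(word)) > 1

pure = "\u0418\u0432\u0430\u043d"   # Иван, all Cyrillic
mixed = "\u0418\u0432a\u043d"       # same name with a Latin 'a' slipped in

print(scripts_in(pure), is_mixed_script(pure))
print(scripts_in(mixed), is_mixed_script(mixed))
```

The two strings render identically in most fonts, yet only the second is flagged. Note that a modifier letter such as ʳ would show up under the pseudo-script "MODIFIER" with this heuristic, so the check would need refining before being used on titles like the ones discussed here.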
Right, so some people have used these superscripts for what they look like, namely superscripts. Consider the perspective of a typesetter working before digitisation. Perhaps he needs to print some Russian words in an otherwise-English context. Do you think he'd bother to have two different bits of metal — one for the Roman A and another for the Cyrillic А? It would surely be cheaper just to use the Roman A in all cases. Or what if he mixed up the Roman A with the Cyrillic А — Would that mean that every word in Roman type that seemed to use a Roman A actually misused a Cyrillic А? Even if you answer "yes" to the second question, how can you possibly know, if the two look identical? It would surely be a fetishisation of the intended use of whatever bit of metal was used to print the letter. In the case of superscripts, the fact that the bits of metal that were used to print them could also have been used to print ordinary letters in smaller font sizes is as inconsequential as whether a Roman A and a Cyrillic А were in fact printed using the same bit of metal. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 21:11, 19 February 2012 (UTC)
Of course, on paper, there is no difference, and which character has been used is irrelevant. But not here, we are not paper. Furthermore, in the present case, they don't look exactly the same. The page titles you propose are wrong. Lmaltier 21:22, 19 February 2012 (UTC)
Conversely, "majᵗʸ" has a more correct appearance than "maj<sup>ty</sup>". In "maj<sup>ty</sup>", the superscripts are too big, too high, and cause line spacing problems, whereas in "majᵗʸ" they are the right size, are at the right level, and have no effect on line spacing. Furthermore, "majᵗʸ" italicised as majᵗʸ has a correct appearance, whereas italicising "maj<sup>ty</sup>" as maj<sup>ty</sup> causes the 〈t〉 to appear on top of the 〈j〉. In terms of functional fit (i.e., using characters for their appearance), these hard-coded superscripts do a better job of representing superscribed characters than using <sup> tags does. I maintain that such functional fit matters more than Unicode-intended purpose. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:56, 20 February 2012 (UTC)
The modifier letters don't appear at all in some fonts/browsers, except as boxes. The relentless march of progress is resulting in both display problems being fixed for more and more people, but I don't think we can tell which problem will be fixed first. So, those two arguments ("modifiers are bad because they're boxes for some people" and "sup is bad because it breaks in italics") may cancel out, IMO. - -sche (discuss) 01:17, 21 February 2012 (UTC)
Whereas boxes are unequivocally seen as a display problem to be fixed, I don't think that the problems with <sup> tags are even recognised. Howbeit, I have just discovered that combining <small> tags with <sup> tags generates superscripts of the correct size and height; for example, "1<small><sup>st</sup></small>", "2<small><sup>nd</sup></small>", "3<small><sup>rd</sup></small>", "4<small><sup>th</sup></small>" generates: "1st", "2nd", "3rd", "4th". They still cause line-spacing problems and are positioned too far to the left when italicised, but this new-found functionality is enough to make me drop my insistence that we use the hard-coded superscripts. I now advocate only that we use those hard-coded superscripts to allow us to distinguish page titles à la [[majty]] vs. [[majᵗʸ]]. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 10:58, 21 February 2012 (UTC)
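The <small><sup> combination described above is simple to generate mechanically, for instance in a template or bot. A minimal sketch follows; the function names are invented for illustration and nothing here is an existing Wiktionary template.

```python
def ordinal_suffix(n):
    """English ordinal suffix for a positive integer (11th, 12th, 13th are special)."""
    if n % 100 in (11, 12, 13):
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(n % 10, "th")

def ordinal_markup(n):
    """Wrap the suffix in <small><sup>...</sup></small>, as in the comment above."""
    return f"{n}<small><sup>{ordinal_suffix(n)}</sup></small>"

for n in (1, 2, 3, 4, 12, 22):
    print(ordinal_markup(n))
```

This only produces the markup; the line-spacing and italic-positioning problems mentioned in the comment are rendering issues that no amount of generated markup addresses.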
┌─────────────────────────────────┘
With regard to pagenames — pagenames = the things that exist in place of xz in http://en.wiktionary.org/wiki/xz and [[xz]] — I'm not convinced we should distinguish "majty" and "majty". I agree with msh210's point, above, that this is like "THE", "THE" etc. I'm generally in favor of including as much information as possible on a page, so if "a" is usually italicized in mathematical equations (which it may not be, I'm just making up an example) or "ty" is usually superscript in "majty", I strongly agree that we should convey this on the page. I just now added a usage note to "LORD" to explain that it is commonly written "LORD". I'm not as insistent as you (Raifʻhār) that we convey such typographical features in the headword line, but I definitely want them mentioned in usage notes or sense-line qualifiers. I think the pagenames should be "LORD", "majty" etc, however. (I accept pagenames like "H₂O" because we redirect to them, but my favoured solution for that, too, would be "H20" as the pagename/URL and "H₂O" as the thing displayed everywhere on the page. But I'm not going to press for that.) In part this is to combat Wiktionary's proliferation of content onto multiple pages; surely "majty" is the same word when typed "majty" on Usenet and when written with superscript letters in an old book, so I don't think we need separate entries for the typographical variation. Having the same pagename may, in the event one language has a word "majty" that's written with superscript letters and another has a word "majty" that is never written with superscript letters, also mean we can't have superscript pagetitles (pagetitle = the part of [[User:Msh210 on a public computer]] that currently displays "user: msh210 public"), but because I expect most "majty"-words are also written "majty" sometimes, I don't see it as a problem to use "majty" as the pagetitle/header (and pagename/URL) and only have a usage note mention "majty". - -sche (discuss) 22:03, 21 February 2012 (UTC)
[Image: majties-temp.png]
By the way, even when the Unicode characters display, they sometimes display in an unschön way (no better or worse than italicized <sup>-letters). Note the "i" raised above the "t" and "es" in the image to the right. - -sche (discuss) 22:53, 21 February 2012 (UTC)
Hmm. What do you think of princip l? — Raifʻhār Doremítzwr ~ (U · T · C) ~ 13:54, 22 February 2012 (UTC)
I created Wiktionary:Votes/pl-2012-02/Using modifier letters for superscript as a possible vote on the subject. Let's all discuss and boldly modify it. As it's set up now, if we cannot get consensus for one option or the other, the unregulated status quo continues. (I feel strongly that whether or not to use ordinals — whether "14th" or some kind of superscript — needs to be a separate vote, although if this vote determines that one or the other method of effecting superscript should be used, that will be binding also on any superscript ordinals.) - -sche (discuss) 22:29, 19 February 2012 (UTC)

CFI and company names

I have created Wiktionary:Votes/pl-2012-02/CFI_and_company_names, which proposes removing the section dedicated to company names from WT:CFI.

If any discussion that results lasts longer than to the beginning of the vote (which is 20 February 2012), feel free to postpone the vote.

A poll relevant to the vote: Wiktionary:Beer_parlour_archive/2011/April#Poll:_Including_company_names.

I emphasize that removing the section does not lead to inclusion of any and all company names. Rather, after removing, the inclusion of company names would be governed by the section on the names of specific entities, just like names of literary works such as Much Ado About Nothing. --Dan Polansky 15:20, 13 February 2012 (UTC)

Great idea, thanks for having the initiative to start this. :) -- Cirt (talk) 23:55, 13 February 2012 (UTC)
As a practical matter, how has the specific-entities rule been applied so far? I assume that some editors have been adding them like crazy, while other editors slowly (or not-so-slowly) list them at RFD? With a specific consensus being required for deletion, but not for creation? —RuakhTALK 18:06, 14 February 2012 (UTC)
After the removal of the attributive-use rule ("A name should be included if it is used attributively, with a widely understood meaning"), which took place in Wiktionary:Votes/pl-2010-05/Names of specific entities, I have seen no editors add names of specific entities like crazy, but I'll stand corrected. Daniel Carrero was adding some names of dubious lexicographical value (IMHO anyway) some time ago, but these were not company names, and he has already stopped. I have recently added a fairly small batch of Czech geographic names, ones that topped a frequency list. Specifically, I have seen no flood of geographic names that was feared by some of the opposers of broad inclusion of geographic names.
In RFD, consensus is required for deletion; that's right. I admit that this creates a pro-keeping bias, as consensus is required for deletion rather than for creation. Wikipedia's w:WP:AfD has the same pro-keeping bias, it seems. The same pro-keeping bias pertains to discussions of idiomaticity in RFD; the bias is specific to RFD rather than to company names. --Dan Polansky 07:58, 15 February 2012 (UTC)

DICTIONARY FOR BRAZILIAN INDIGENOUS LANGUAGES

Hello, My name is Rodrigo Cotrim. I'm a linguistics professor in Brazil and I've been working with indigenous languages spoken nowadays in Brazil (13 Brazilian languages out of the 180 existing ones). I would like to make a request to create a dictionary for at least one of those languages I'm working with. It would help me and my indigenous students to make a word list/glossary/vocabulary/dictionary/thesaurus of their mother tongue (L1) (and of their second language (L2), Brazilian Portuguese). This dictionary would help to expand the scientific knowledge of an endangered language spoken in Brazil. It would also help my indigenous students (many of whom are also indigenous teachers in their villages) in their schools, since the Brazilian government has been installing computers and Internet access at public schools located in indigenous villages. Could someone help us? My students and I will be really thankful and we are really looking forward to an answer. Sincerely, Rodrigo Cotrim (Professor at Federal University of Goiás, Goiânia, Brazil)—This unsigned comment was added by Rodrigo Smisuite (talk • contribs).

Such words are certainly welcome here as entries (though the "definitions" are English translations); see template:welcome for basic information about how things work around here, and feel free to ask here (or, better, at WT:ID) any further questions you have.​—msh210 (talk) 02:39, 15 February 2012 (UTC)
We have some that you can look at. This should act as a guide for you: Category:Guaraní language. —Stephen (Talk) 03:49, 15 February 2012 (UTC)
Also note that there is a Portuguese-language Wiktionary, where the glosses are written in Portuguese and the administration of the Wiktionary is discussed in Portuguese. You and your students may prefer to create your dictionary there, so that glosses and communication can be conducted in that language rather than English. Of course your entries are welcome at English Wiktionary too! But if you prefer using Portuguese, you should be aware that there is that option. —Angr 10:27, 15 February 2012 (UTC)

I'm not 100% happy with this proposal, but I think it's an improvement over the status quo.

Things I'm not so happy with:

  • What about multiple quotations from a small group, such as a single Usenet group? Should they be counted as independent?
  • I don't like the broadness of "anything like the following", but I also didn't want to try to microscopically define all corner cases.

Input or improvements on these points, or on any other, would be welcome.

RuakhTALK 20:22, 15 February 2012 (UTC)

Hm, I don't know if this is a good idea. I like requiring independence of citations as a general principle for what makes something a real word, but it doesn't translate well into an actual usable firm rule that doesn't break certain things. I'm not sure if the proposed replacement is an improvement. --Yair rand 20:33, 15 February 2012 (UTC)
So, what would you suggest instead? —RuakhTALK 20:54, 15 February 2012 (UTC)
Well, taking into account that a proposal needs community consensus, I would just leave the section the way it is. In a situation where I find myself appointed Supreme Dictator of Wiktionary, I would probably change it to something horribly ambiguous, and leave relevant decisions to whoever happens across the relevant RFV or RFD and can get enough people to agree that "that is/isn't really independent...", and win the inevitable new argument about what independence means, which can be repeated every time the situation pops up (thus producing all sorts of interesting examples and arguments which might be useful in drafting a potential new policy), sort of like what we do with noun/proper noun designations. :P --Yair rand 21:07, 15 February 2012 (UTC)
Ah. The current section is so bad that I guess I just don't see leaving-it-the-way-it-is as an option. :-P   The key problem, by the way, isn't that it's vague (which I assume is what you mean by "ambiguous"), but that it's contradictory: it proposes a specific rule, giving non-durably-archived examples, and then explains that the rationale is something completely unrelated. Obviously I'd prefer a guideline that's actually usable, but failing that, we need to fix the current text somehow. (You complain about the difficulty of getting a rule "that doesn't break certain things", but the current text already is broken . . .) —RuakhTALK 21:30, 15 February 2012 (UTC)
The current text certainly has significant problems, but it has the advantage of being very open to community interpretation. The only real statement in that section (excluding the last sentence, which we generally just don't listen to) is that we want to exclude multiple references/uses that draw on each other. The proposed version actually gives specific points about what that means. A famous quote that becomes an idiom (ex. et tu, Brute) could have an issue with this, as every use of it technically is a verbatim quotation. --Yair rand 22:36, 15 February 2012 (UTC)
For the most part, I like your (Ruakh's) proposal. Where I see a problem is in the italicized part (by me) of "This serves to exclude uses that draw from each other, or that draw from a common source": "draw" is so broad that specialist uses of a shared term that can be traced to a common source might be considered dependent; an example would be speciesism, I think, which can be traced to Richard D. Ryder from 1973 if one believes Wikipedia. -Dan Polansky 21:34, 15 February 2012 (UTC)
Most uses of a word (at least of an invented word) ultimately originate from a common source. The text should make clear that uses of a word in different sentences written by different people always are independent citations, whatever this word is. Lmaltier 21:48, 15 February 2012 (UTC)
A proposed edit: "In particular, two uses are non-independent if (but not only if) anything like the following is true:". This may be what was intended. As a consequence, the rule would be more explicitly open-ended. --Dan Polansky 21:38, 15 February 2012 (UTC)
What are the other possibilities? How about instead of "(but not only if)" we add another bullet point with "if consensus of the Wiktionary community finds it to be non-independent" or some such? Pengo 22:38, 15 February 2012 (UTC)
The word "if" is often read as "if and only if". This was the reading many editors applied to "if" in "A name should be included if it is used attributively, with a widely understood meaning". The same reading is usually applied to "This in turn leads to the somewhat more formal guideline of including a term if it is attested and idiomatic": a term should be included <=> the term is attested and idiomatic. --Dan Polansky 22:49, 15 February 2012 (UTC)
When someone misreads "if" as "if and only if" their error should not be treated as correct. No syllogism is bidirectional unless that is clearly specified. Eclecticology 09:40, 16 February 2012 (UTC)
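The difference between the two readings of "if" can be made concrete with a small truth table. This is only an illustrative sketch: here `p` stands for "the term is attested and idiomatic" and `q` for "the term is included", labels chosen for this example rather than taken from CFI.

```python
from itertools import product

def implies(p, q):
    """'q if p': only violated when p holds and q does not."""
    return (not p) or q

def iff(p, q):
    """'q if and only if p': violated whenever p and q differ."""
    return p == q

rows = [(p, q, implies(p, q), iff(p, q))
        for p, q in product([True, False], repeat=2)]

print("p      q      if     iff")
for p, q, one_way, two_way in rows:
    print(f"{p!s:<6} {q!s:<6} {one_way!s:<6} {two_way!s:<6}")

# The single case where the two readings disagree:
differing = [(p, q) for p, q, one_way, two_way in rows if one_way != two_way]
print(differing)
```

The readings diverge only when `p` is false and `q` is true, i.e. a term that is included without being attested and idiomatic: the one-directional rule says nothing against it, while the "if and only if" reading forbids it.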
  • I've updated the proposed text to steal Lmaltier's explanation of "independent", almost verbatim; to eliminate the vagueness that Dan Polansky points out in "draw"; and to plop in a "roughly speaking" and "generally" in acknowledgement of Yair rand's point (though I'm sure he won't consider it nearly enough). The "roughly speaking" and "generally" hopefully also address the point that Dan Polansky was making about how "if" is often taken to mean "if and only if". (Another possibility is to insert a "say" or "for example". I'm not a fan of the "if (but not only if)" wording, though; for some reason, even though it gets several thousand b.g.c. hits, it sounds very strange to me.) —RuakhTALK 00:54, 16 February 2012 (UTC)
    I can't find any problem with this revision, the latest one. (I have made a small fix to the vote.) Because the second sentence and the bullet points are introduced by "Roughly speaking", this gives some flexibility. Great job! --Dan Polansky 06:51, 16 February 2012 (UTC)
  • Thank you! —RuakhTALK 14:42, 16 February 2012 (UTC)

Votes to change CFI

In addition to the vote Ruakh has set up (on Independence, see the section just above this) and the vote Dan has set up (on company names, two sections up), Liliana has set up a vote for Removing "Vandalism" and "Protologisms" sections of CFI pursuant to October's straw poll, and I have set up one vote to make small changes concerning Patronymics and stylistic edits of CFI and another to remove the section on Attestation vs the slippery slope, both also inspired by the results of the straw poll and other past discussions. Woo, voting. (Other bits of CFI the community expressed an interest in re-examining, but concerning which no vote has yet been set up, include Idiomaticity, Natural Languages, Constructed Languages, Brand Names, Names of Specific Entities.) - -sche (discuss) 23:26, 15 February 2012 (UTC)

Dan Polansky suggests on the talk page that we could link the key words in our general rule to the sections of CFI that define them (like <tt>[[#Attestation|attested]]</tt>) rather than putting them in bold and linking to the main namespace. Please comment here or on the talk page if you have a preference for one idea or the other. Also, WT:CFI currently uses a mix of curly (“”’) and straight ("') quotation marks and apostrophes; please also comment if you have a preference for one of those or the other. :) - -sche (discuss) 19:27, 16 February 2012 (UTC)

Being paid to write Wiktionary entries

I have been told by email that one of our contributors (User:Boundlesslearning) is being paid (by an e-learning company) to write articles for us. Notwithstanding that his contributions have been of poor quality (where they weren't just sum-of-parts), is this acceptable? SemperBlotto 09:07, 16 February 2012 (UTC)

As long as it is understood by both the company doing the paying and the user doing the editing that they both waive any property claims over the latter's contributions hereto, I don't suppose it really makes any difference to us. That being said, Boundlesslearning's wage gives him an ulterior motive for editing here; consequently, we are thereby justified in assuming bad faith on his part if the quality of his contributions does not improve rapidly. To put it bluntly, if he's getting paid to edit here, he'd better make sure his contributions are worth having, and that he isn't just adding mess that has to be cleaned up by the unpaid volunteers who make up the vast majority of the editing community here. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 09:49, 16 February 2012 (UTC)
Agreed. I would like to know why they were having him edit. The essence of the problem on Wikipedia is that those paid to edit are dismotivated to follow NPOV. If he actually has reasons to improve the dictionary, then it's a good thing; if he's here to spam, it's not.--Prosfilaes 10:55, 16 February 2012 (UTC)
Very worrying. They will most certainly have an ulterior motive of spamming their techniques, technologies, etc. even if they aren't so direct as to do it with hyperlinks. Wikipedia has policies about this, as it has been far more of a problem there; does anyone know what they are? Equinox 09:58, 16 February 2012 (UTC)
If party A wants to pay party B, it's beyond our remit to interfere. What we can 'interfere' with is the contributions of individual editors. If an editor is vandalistic or consistently makes bad but non-vandalistic edits, we should block them. Having witnessed Boundlesslearning's edits a block seems very much appropriate. Mglovesfun (talk) 11:17, 16 February 2012 (UTC)
Are you sure? He doesn't seem to edit frequently enough to be paid. We should watch carefully the external links he adds, but I haven't noticed any POV yet. Ungoliant MMDCCLXIV 14:16, 16 February 2012 (UTC)
I think it would be somewhat harder to insert POV into dictionary entries than in encyclopedia entries. My concern with a paid editor here would only be with the quality of their work and their conformance to the CFI, not with their bias for or against a particular viewpoint. bd2412 T 14:24, 16 February 2012 (UTC)
I think the burning question here should be "how can we get paid to write Wiktionary entries?" --Itkilledthecat 14:19, 16 February 2012 (UTC)
WF - You'll get your reward in Heaven (or possibly the other place). SemperBlotto 15:18, 16 February 2012 (UTC)
I wouldn't mind getting paid to do legitimate Wiktionary work. My biggest question about someone getting paid would be: "Why not me?"
I would wonder about the credibility of charges that someone was being paid, as such charges could be leveled at anyone. We are not really in a position to investigate such charges.
Such paid work could be both legitimate for Wiktionary and in a payer's interest under various circumstances.
  1. If an industry association or trade union wanted to make available the technical terms of its industry in hopes of getting someone to translate them into other languages, we might object to flooding by {{trreq}}, but we should welcome the addition of perhaps obscure entries, subject to our usual standards for inclusion, such as they are.
  2. If some national government paid for the entry of words in a recognized language, would we object? Should we?
  3. If some tourist board employee entered all the locations within its remit, could we object? DCDuring TALK 19:10, 16 February 2012 (UTC)
That's what I should have said. What matters here is the entries, not who created them or why. Mglovesfun (talk) 19:21, 16 February 2012 (UTC)
Exactly. Motivations are not relevant (after all, everyone here must have one's own motivations), provided that what is done improves the Wiktionary. Lmaltier 20:10, 16 February 2012 (UTC)
A less legitimate rationale: it might happen that people get paid for introducing a very large number of (not too obvious) copyright violations with the end of the project as the ultimate objective. Lmaltier 20:23, 16 February 2012 (UTC)
I agree with DCDuring and Lmaltier, if someone is paid to edit Wiktionary that can be OK, as long as their edits are good. (Re Lmaltier's second comment: even a volunteer could introduce a large number of copyvios, as User:Primetime did.) - -sche (discuss) 08:17, 17 February 2012 (UTC)
The edits seem fine to me. Whatever happened to "assume good faith"? I studied biotechnology and many of the contributions are common terms I recognise, and have perfectly reasonable definitions. Pengo 01:54, 17 February 2012 (UTC)
That is because every single one of them has been cleaned up by another user. SemperBlotto 07:59, 17 February 2012 (UTC)
If someone wants to pay me for editing Wiktionary, please let me know :). On the relevant note, I see no problem with being paid per se; the contributions should be judged on their own. --Dan Polansky 08:27, 17 February 2012 (UTC)
Note. This user is now operating under the name of User:Scienceexplorer (confirmation via email). SemperBlotto 11:17, 22 February 2012 (UTC)
The issue I see here is, the decision (of paying these contributors) was made unilaterally by an external business who has no direct influence over any Wikimedia site without consulting with anyone from Wikimedia. So there is next to no understanding or communication of their (ulterior) motive in making this initiative. They also made no effort in understanding the existing standards and conventions used in any given Wikimedia site, before devoting their money to inexperienced editors. Besides, I have seen these people's (yep, I suspect there is more than one person involved) edits and their quality is nowhere near as good as the quality of the contributions of the amateur (or professional in some cases) lexicographers on this dictionary website. JamesjiaoTC 11:40, 22 February 2012 (UTC)
  • I looked at 10 or so of today's contributions from User:Scienceexplorer. They seemed reasonably well formatted and well worded. I have challenged three that seemed SoP to me, but not everyone agrees with my nominations to RfD. The contributor may not be sensitive to matters like whether an NP headed by a word (protein) that is both countable and uncountable isn't also both uncountable and countable. It would be nice to see at least one citation. Even for the SoP terms, I see no reason for them not to be in a glossary-type appendix and/or redirects either to another headword or to the appendix. IOW, this seems like better than average specialized content. If the person is getting paid, s/he has plenty of incentive to learn our approach and apparently has. DCDuring TALK 17:28, 22 February 2012 (UTC)

Created category, Freedom of speech and en:Freedom of speech

Created new category, for Freedom of speech. This is in conjunction with crosswiki sister project coordination at Commons:Category:Freedom of speech. Please feel free to help populate it, that'd be most appreciated. ;) Cheers, -- Cirt (talk) 06:13, 17 February 2012 (UTC)

I find it an excessively narrow topical category. DCDuring TALK 12:04, 17 February 2012 (UTC)
As do I.​—msh210 (talk) 19:27, 20 February 2012 (UTC)

Not sure if this is a purely technical issue and belongs to Wiktionary:Grease pit but I have created three entries for Arabic diacritics but the next/previous buttons show something else and the red links suggest unsupported titles. Does Wiktionary fully support Arabic diacritics? As you see the headers for the entries are better used in combination with ـ (taṭwīl/kashida - the elongation symbol). What's the best way to create these entries? Do they belong to unsupported titles? Trying to show links to the new entries here: َ‎, ِ‎ and ُ --Anatoli (обсудить) 04:11, 20 February 2012 (UTC)

Hmm, I can't get to the link to these three entries on my contributions list (currently using Windows XP, Firefox browser). Can see the symbols but no link. --Anatoli (обсудить) 04:26, 20 February 2012 (UTC)
I don't have any difficulty opening َ or ِ or ُ. —Stephen (Talk) 10:06, 20 February 2012 (UTC)
Thanks, Stephen. Now using Windows 7 - my home computer, which also has Arabic support installed. I don't see the links to the entries at all. If I open Category:Arabic diacritical marks, I only see three bullet points. I can only see the symbols (over or under |) in the edit mode while typing this reply. I don't understand what's going on. --Anatoli (обсудить) 11:09, 20 February 2012 (UTC)
I don't know, either. I am using WinXP Pro and Firefox 10, and for me it's no problem. I can open the entries and I can see them in the Category page. —Stephen (Talk) 11:21, 20 February 2012 (UTC)
I can see them just fine on Windows XP and Opera 11. They display the dotted circle similar to other scripts. -- Liliana 13:57, 20 February 2012 (UTC)

For reference: one‎, two, and three.​—msh210 (talk) 19:22, 20 February 2012 (UTC)

I don't see the links on Firefox 10 in Linux Mint 11 but I do see msh210's links. —CodeCat 19:37, 20 February 2012 (UTC)

I don't see Anatoli's links (Firefox 10, Windows 7), except when editing the page to write this, where they appear over lines, as he describes. I don't see links in the category, either, only bullet points. I do see the characters once I reach the page via msh210's links. Perhaps this is another good example of the need for combining characters to be combined with something. (The combining-character-only pagetitles could certainly redirect to composed forms, or the composed forms could redirect to combining forms.) - -sche (discuss) 20:45, 20 February 2012 (UTC)
For now, I'll make redirects - ـَ‎, ـُ‎ and ـِ‎ and others later. I can see the links on my work computer (Windows XP, Firefox 5) but not on my home laptop (Windows 7, Firefox 5). The results with other browsers and systems may be unexpected. Perhaps we need to check some other similar examples where a diacritic can only work in combination with something, like -sche suggested. --Anatoli (обсудить) 22:14, 20 February 2012 (UTC)

Is there a page where I can see all flags? And where is the correct place to discuss them (inclusion, change, etc.)? Ungoliant MMDCCLXIV 20:15, 20 February 2012 (UTC)

*see*? The code for the flags is stored in MediaWiki:Gadget-WiktCountryFlags.css. If you want to discuss anything, do so here I guess. -- Liliana 20:25, 20 February 2012 (UTC)
Thank you. Ungoliant MMDCCLXIV 20:49, 20 February 2012 (UTC)
By the way, the flag for !Xóõ isn't working. Bloody encoding. Ungoliant MMDCCLXIV 02:15, 27 February 2012 (UTC)
Bleh. No idea how to get that to work. -- Liliana 02:38, 27 February 2012 (UTC)
I think this MediaWiki behavior is a bug. Neither HTML nor XML allows attributes of type ID to start with ., so the encoding of !Xóõ as .21X.C3.B3.C3.B5 is invalid. —RuakhTALK 03:34, 27 February 2012 (UTC)
In that case, shouldn't it be reported to mediazilla:? -- Liliana 03:42, 27 February 2012 (UTC)
Adding a \ before each . should work, I think. --Yair rand (talk) 03:45, 27 February 2012 (UTC)
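For illustration, the encoding and the suggested escaping can be sketched in Python. This is only a sketch under assumptions: `dot_encode` and `css_escape_dots` are hypothetical helper names, the function merely imitates the encoding quoted above (percent-encoding with . in place of %) rather than calling any MediaWiki code, and whether the flag gadget actually selects elements by ID is not confirmed here.

```python
from urllib.parse import quote

def dot_encode(name: str) -> str:
    """Percent-encode the UTF-8 bytes of `name`, then write '.' in place of '%'.

    Hypothetical helper that imitates the encoding quoted in this thread.
    """
    return quote(name, safe="").replace("%", ".")

def css_escape_dots(ident: str) -> str:
    """Prefix every '.' with a backslash so CSS reads it as a literal character."""
    return ident.replace(".", "\\.")

encoded = dot_encode("!Xóõ")
selector = "#" + css_escape_dots(encoded)  # hypothetical ID selector
print(encoded)   # .21X.C3.B3.C3.B5
print(selector)  # #\.21X\.C3\.B3\.C3\.B5
```

Without the backslashes, a CSS parser would read each . as the start of a class selector, which may be why the unescaped rule never matches.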

Rhymes by dialect in Catalan (but possibly other languages too)

The current way of categorising rhymes in Catalan follows the standard Central Catalan dialect of Catalonia, which is the best-known standard for Catalan. However, there are other dialects, some with their own standard, notably Valencian and Balearic. The problem is that these dialects distinguish certain phonemes that Central Catalan doesn't, especially in unstressed syllables. In Central Catalan, unstressed a and e are pronounced the same (as schwa), as are unstressed o and u (as u), so words ending with those vowels (optionally followed by more sounds) rhyme in Central Catalan whereas they don't rhyme in Valencian. Conversely, the stressed ɔ of Central Catalan is often merged with o in Valencian, so that for example dónes and dones sound alike in Valencian but not in Central Catalan. The same situation occurs with ɛ and e, but Balearic has a third e-like phoneme, stressed ə. I'm wondering how this situation can be solved, seeing as certain rhymes are currently thrown together for the sake of Central Catalan while such mergers are inappropriate for Valencian speakers. Should the categories be split so that both dialects are represented, with a footnote that for example words ending in -os rhyme with those in -us in Central Catalan? And what about Balearic, a dialect that has fairly few speakers and even fewer contributors... —CodeCat 00:49, 23 February 2012 (UTC)

See Rhymes:English:-ɛri for how we handled one case where some dialects of English exhibit rhymes that others do not. I don't know if that's the only approach we're using for English (we're not famously consistent about these sorts of things), and maybe it's not the best approach for Catalan; but it's probably a decent starting-point. —RuakhTALK 03:51, 27 February 2012 (UTC)
That approach is used for some Catalan rhyme pages as well, but the issue is that currently our Catalan rhyme pages use the schwa phoneme (in the title), which exists in Central Catalan but corresponds to two different phonemes in Valencian. This means that the words on, for example, Rhymes:Catalan:-onə might rhyme in Central Catalan but not in Valencian, where they would be differentiated into -ona and -one. So the question is whether there should be Rhymes:Catalan:-ona and Rhymes:Catalan:-one with a notice like the one you mentioned, even though Central Catalan doesn't have unstressed -a or -e. —CodeCat 12:51, 27 February 2012 (UTC)
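To make the split-versus-merge question concrete, here is a rough Python sketch of how Valencian-style rhyme endings could be mapped onto Central Catalan rhyme pages. It assumes, as a deliberate simplification, that the first vowel of a rhyme is the stressed one and that every later a or e reduces to ə and every later o to u; the function names and the toy list of endings are hypothetical, and real Catalan vowel reduction has exceptions this ignores.

```python
from collections import defaultdict

# Simplified Central Catalan reduction of unstressed vowels (assumption).
REDUCTION = {"a": "ə", "e": "ə", "o": "u"}
VOWELS = set("aeiouəɛɔ")

def central_rhyme(valencian_rhyme: str) -> str:
    """Keep the first (stressed) vowel, reduce every later unstressed vowel."""
    out = []
    seen_stressed = False
    for ch in valencian_rhyme:
        if ch in VOWELS and not seen_stressed:
            seen_stressed = True      # the stressed vowel is kept unchanged
            out.append(ch)
        elif seen_stressed:
            out.append(REDUCTION.get(ch, ch))
        else:
            out.append(ch)
    return "".join(out)

def group_by_central(valencian_rhymes):
    """Map each Central Catalan rhyme to the Valencian rhymes that merge into it."""
    groups = defaultdict(list)
    for rhyme in valencian_rhymes:
        groups[central_rhyme(rhyme)].append(rhyme)
    return dict(groups)

print(group_by_central(["ona", "one", "alos", "alus"]))
# {'onə': ['ona', 'one'], 'alus': ['alos', 'alus']}
```

Under this view, split pages such as -ona and -one would each carry a notice pointing at the other members of their Central Catalan group.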

Proposal - complete unified login for all eligible accounts

I have created a proposal at Meta, to complete unified login for all eligible accounts. Unified login is a relatively new feature to the WMF wikis, allowing each user to have a single combined account in every project. Users that only have an account on one wiki would extend that to all wikis, and users that already have accounts on multiple wikis would have them combined. It was initially an opt-in for existing users, but it is now done by default for all new users. This leaves us with three groups of users: those with UL, those that cannot complete UL because of a naming conflict on another wiki, and those with no conflict that have simply not completed the process. I am proposing that account unification be completed for all eligible accounts without requiring the user to take any additional steps. This would make UL the rule rather than the exception that it currently is, and bring us closer to the goals of universal watchlists, recent changes, interwiki page moves, etc. This would be especially helpful on Commons, which has so many images that were originally uploaded at another WMF wiki, enabling better attribution without interwiki links. I propose that it be carried out as a one-time process rather than a continuous automatic software process, allowing users to still adjust ULs as they see fit.

If you have any opinion one way or the other, please reply at the proposal at Meta. JohnnyMrNinja 01:13, 23 February 2012 (UTC)

Misuse of rollback by SemperBlotto

Here, SemperBlotto used rollback to revert a perfectly good-faith edit without any explanation given for the revert. This is not the first time he has done this to me; nor am I the only person who he has misused rollback on. I hereby request that his rollback privileges be suspended owing to continual abuse. Purplebackpack89 (Notes Taken) (Locker) 01:20, 26 February 2012 (UTC)

A cursory examination of his talk page reveals numerous complaints about hasty deletions or reverts. This has got to stop. Purplebackpack89 (Notes Taken) (Locker) 01:28, 26 February 2012 (UTC)
Good faith isn't good enough. In the edit under discussion you seem to have confused an etymology with a definition. Metro clearly functions as a word in its own right, having meaning that is not identical to either metropolitan or metropolitan area. DCDuring TALK 02:30, 26 February 2012 (UTC)
"Good faith isn't good enough". If an edit was made in good faith, it can't be rolled back, even if it's wrong. It can be fixed or undone, but not rolled back. Rollback is for bad-faith edits only. The issue here is that Semper makes reverts and deletions too quickly to be anywhere near 100% accurate about being vandalism or not (This is hardly the first time he's been inaccurate with rollback). Because of that, he should forfeit his tools. And FYI, it is a definition; in many cases "metro" is used as a synonym for the adjective use of "metropolitan", not just as a noun regarding transit. Purplebackpack89 (Notes Taken) (Locker) 04:10, 26 February 2012 (UTC)
Wrong good-faith edits can be rolled back. "Rolling back" is just a one-click version of "undoing". Admins are busy people, so if an edit is wrong enough to merit undoing, it will often be rolled back. - -sche (discuss) 05:10, 26 February 2012 (UTC)
That's a misuse of rollback, -sche. SemperBlotto serially misuses it, and the deletion tool as well, bites newcomers, and doesn't assume good faith. Frankly, I cannot understand how he is still an admin. Purplebackpack89 (Notes Taken) (Locker) 05:39, 26 February 2012 (UTC)
DCDuring and -sche both clearly feel that it can be O.K. to roll back good-faith edits, and I'll add my voice to their chorus. Do you have any evidence for your contrary claim? For example, can you link to a Wiktionary policy or guideline on the subject? —RuakhTALK 05:51, 26 February 2012 (UTC)
Lemme turn the tables on you...on any other WikiMedia project, rollback can't be used for good-faith edits. Where's the policy or guideline that says we can or should here? Purplebackpack89 (Notes Taken) (Locker) 06:01, 26 February 2012 (UTC)
We conveniently don't have a rollback policy, so I go to Meta.

Rolling back a good-faith edit, without explanation, may be misinterpreted as "I think your edit was no better than vandalism and reverting it doesn't need an explanation". Some editors are sensitive to such perceived slights; if you use the rollback feature other than for vandalism (for example, because undo is impractical due to the large page size), it is courteous to leave an explanation on the article's talk page or on the talk page of the user, whose edit(s) you have reverted.

So, at the very least, SemperBlotto is being discourteous and BITEy. I think it's time we got a rollback policy of our own, and I propose that we follow the lead of EN and most other WikiMedia projects and state that rollback is for vandalism only. Purplebackpack89 (Notes Taken) (Locker) 06:34, 26 February 2012 (UTC)
I repeat what Mglovesfun said in WT:FEED: "Something doesn't have to be vandalism to be removed, it just has to be bad. If the version rolled back to is better than the previous version I support it. Wikipedia seems to have a habit of prioritizing contributors over its articles, I'd be delighted if we didn't do the same here." - -sche (discuss) 07:43, 26 February 2012 (UTC)
@Purplebackpack89, rubbish, anything can be rolled back. I've rolled back my own good faith edits before, therefore should I lose my admin privileges?! You're making the classic mistake of assuming that we're Wikipedia, and we're not. I hate the idea that someone who makes a good faith bad edit is immune to having that edit removed; we might as well say we welcome bad edits. Mglovesfun (talk) 12:13, 26 February 2012 (UTC)
Um, there's still the undo button, and regular editing to get rid of good faith bad edits. The point is it ain't right for Semper to remove something like that without bothering to explain why. Purplebackpack89 (Notes Taken) (Locker) 17:02, 26 February 2012 (UTC)

That Meta page seems to be just a help page. So it would not be a policy page on Meta; and, either way, it's definitely not a policy on Wiktionary. Even if it were a policy, it does not say "Rollbacking one good-faith edit is grounds for revoking rollback rights." The section you (Purplebackpack89) copy-pasted here is worded as an essay, rather than a rule. And the whole page focuses on Wikipedia, with jargon like "article" (we say "entry"), "encyclopedic" and "the processes in dispute resolution".

In particular, the idea of always explaining about reverts on users' talk pages looks somewhat good on paper, but:

  • It would be very cumbersome to implement: sometimes we do that, but there are so many edits to be reverted and so few people to do the work (mostly Semper alone).
  • Here it would be useless most of the time. Wikipedia has long articles, with their wordings, coverage, "notability", extensive sections and so on. When an edit (particularly a big edit) of Wikipedia is reverted, it can be difficult to determine why, unless someone explains. When an edit in Wiktionary that fits the standardized system (is formatted with the right sections, lines and is not gibberish like "glrbglblggbrlb" or "LOL FAG") is reverted, the obvious justification commonly is "This entry would be better without these new five or twelve words that you added." If you defined metro as # Abbreviation of [[metropolitan]]., then the obvious "explanation" implied in SemperBlotto's action is "I believe 'metro' is not an abbreviation of metropolitan." (or this variation: "I believe you should not say that metro is an abbreviation of metropolitan.") Do you really need more than that?

You already came here and got your explanation. Your edit is gone, as it should be. It doesn't matter whether the rollback function did it, or it was the "undo" button, or that a meteor crashed on the servers and flipped a few bytes. If it was really in good faith, I suppose you can accept its short life and move on.

P.S.: meta:Rollback says "If your material is reverted, don't take it personally." --Daniel 13:03, 26 February 2012 (UTC)

WT is unlike WP in that it's fairly liberal about references. Being more lax with references means being quicker to revert edits. The only support for contributions without references comes from the approval of other editors, and the edit in question failed that test, so if it's really a good edit, someone must provide some form of verification. --Haplology (talk) 13:48, 26 February 2012 (UTC)

We do have Help:Reverting, which mentions that "Reverting vandalism is obviously acceptable, as is reverting copyright violation and edits that do not conform to our Criteria for inclusion." (italics mine). Maybe SemperBlotto removed it because you placed that definition in the Noun section. Ungoliant MMDCCLXIV 14:19, 26 February 2012 (UTC)

I wholly support what SB did. If such an edit of yours is contested in future, add cites or lump it. — Raifʻhār Doremítzwr ~ (U · T · C) ~ 16:10, 26 February 2012 (UTC)

Quoting Purplebackpack89 "That's a misuse of rollback to do that". No it isn't. In your opinion, I'm sure it's a misuse, but on this wiki there's no rule about it and as this discussion has shown, there's no consensus to consider it a misuse, the opposite in fact, so may I suggest now is a good time to drop the matter entirely. As I like to say, if you don't want your bad edits reverted, don't make any bad edits. Mglovesfun (talk) 19:53, 26 February 2012 (UTC)

Brand names and physical product

I have created Wiktionary:Votes/pl-2012-02/Brand names and physical product, as several people act in RFV as if the wording of "physical product" were not part of WT:BRAND. Thus, there is some support for getting the wording of "physical product" removed, and let us see how big that support really is. I am going to oppose.

I have left the rationale empty. Those who support the proposal have to come up with a rationale, or leave it empty if they oppose rationales in votes.

My rationale for opposing the removal is that the wording makes the already needlessly exclusionist WT:BRAND even more exclusionist, disregarding lexicographical merit of entries. If I could decide, I would drop WT:BRAND rather than making it stronger. Unfortunately, WT:BRAND has been voted on.

Feel free to postpone the vote should the discussion last until the planned start of the vote, which is 4 March 2012. --Dan Polansky (talk) 17:05, 26 February 2012 (UTC)

Interesting idea, and I agree that a product does have to be physical in some way. Though physical doesn't need to mean tangible; electricity could be physical, for example. But something which is an idea cannot on its own be physical. Bugs Bunny isn't physical, though manifestations of it can be physical, such as a toy. Mglovesfun (talk) 19:40, 26 February 2012 (UTC)
Would a book title, exclusively distributed by electronic means, be a branded product by that reasoning? What about the physical representation of an idea in the brain? What about the physical representation of a brand as a sequence of letters on a piece of paper or a storefront? DCDuring TALK 20:08, 26 February 2012 (UTC)
That's what I mean, anything can be represented physically, but the representation is not the thing itself. If I write the word chair on a piece of paper, it's not a chair. But if I hold a can of Lynx deodorant in my hand it is a can of Lynx deodorant. Geddit? Mglovesfun (talk) 23:14, 26 February 2012 (UTC)
I think I geddit. I also think brands are different from physical entities, but they are embodied in physical objects. "Tony the Tiger" is a trademark associated with Kellogg's Frosted Flakes. What about a "Bugs Bunny" doll? What about "Warner Brothers" or "WB"? What about a patch with "John Deere" or "Citibank" on it? What about an envelope or letterhead stationery with a brand name and logo? DCDuring TALK 23:58, 26 February 2012 (UTC)
