Wiktionary:Grease pit/2013/April

::: The complexity and the speed problems will happen because we will have to write a lot of code just to make a Lua module "understand" this new programming language that we will be creating. Essentially, we will be implementing a language within another language. So why should we reinvent the wheel when we already have a language that works and that everyone is familiar with: template code? The presence of the translation editor makes this even more redundant, because users won't even need to interact with the code. As long as there is no guarantee that this will not make things worse, I don't see how you could support it unless you do not actually have a full grasp of the implications of such a project. {{User:CodeCat/signature}} 23:24, 26 April 2013 (UTC)
::::"This will, at best, become something like #Module:links which eventually went nowhere" Oh come on, Lua is really, really new; it's way too early to say 'eventually went nowhere'. [[User:Mglovesfun|Mglovesfun]] ([[User talk:Mglovesfun|talk]]) 23:42, 26 April 2013 (UTC)
::: I did a quick search for names of languages indigenous to the British Isles to give a very rough idea of the complexity involved:
*Irish, Erse, Irish Gaelic, Hibernian Gaelic, Gaeilge, Gaelige, Gaedhlag, Gaedhilge, Gaedhilic, Gaeilic, Gaeilig, Gaelic
*Manx, Manks, Manx Gaelic, Gaelg, Gailck
*Scots Gaelic, Scotch Gaelic, Scots-Gaelic, Scotch-Gaelic, Caledonian Gaelic, Erse, Gàidhlig, Gaelic
*Welsh, Welch, Cambrian, Cambric, Cymric, Cymraeg
*Cornish, Kernowek, Kernewek
*Old English, Anglo-Saxon, Anglo Saxon, Anglosaxon, Englisc
*Scots, Scotch, Inglis
True, more than half of these are obscure and unlikely to be used here, though there are enough works with odd or old usage in places like Google Books that it's hard to be categorical. It's also true that there are probably other names I missed, and variants in the Celtic languages due to consonant mutation.

What does the code do when someone puts "Gaelic", or "Scottish", or uses some misspelling that could be interpreted as more than one of the names for the Goidelic languages, which all look very similar? There are several other cases where the same name is used for more than one language. It's true that template code has problems with people using se for sv, lt for lv or la, and so on, but at least template code ''looks'' tricky, so people are more likely to look up the correct code. The new format looks like you can put just any old thing and the system will figure it out: in effect, it appears as if it's offering to do the nitpicky part for you. In reality, though, it will probably have its own set of constraints. It kind of reminds me of [[w:COBOL]], and all the talk of how it was going to make coding just like writing English, and make it possible for computer-illiterates to understand it.

All of this can be dealt with, but it will take lots of work to educate both the system and the editors. Even if we keep both old and new going side-by-side to space out the conversion work, there are still going to be cleanup categories with the stuff the module hasn't figured out yet, and someone's going to have to go through them. It would have to be a pretty big improvement to justify the extra work. [[User:Chuck Entz|Chuck Entz]] ([[User talk:Chuck Entz|talk]]) 01:50, 27 April 2013 (UTC)

My opinion is that it's not a terrible idea, but Conrad's translation adder already has it beat in terms of editor friendliness. Because of this, it seems to me like we would be doing a lot of work for no gain. Also, correct me if I'm wrong, as Lua is still very new to me and I don't fully understand it, but wouldn't Lua have to essentially recompile the whole thing every time someone adds a translation? Conrad's approach has the advantage of once and done, that is, the computation is done once and then hard-coded into the entry, and does not have to be figured out again. -[[User:Atelaes|Atelaes]] <small>[[User talk:Atelaes|λάλει ἐμοί]]</small> 23:56, 26 April 2013 (UTC)
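Chuck Entz's point about ambiguous names can be made concrete with a small sketch. The Python below is only an illustration, not code from any Wiktionary module: the name lists are a subset of those quoted above, the codes are the ISO 639 codes for those languages, and `NAMES_BY_CODE`, `build_index` and `resolve` are hypothetical names.

```python
from collections import defaultdict

# Toy subset of the names listed in the comment above, keyed by ISO 639 code.
# This is illustrative data, not Wiktionary's actual language table.
NAMES_BY_CODE = {
    "ga": ["Irish", "Erse", "Irish Gaelic", "Gaeilge", "Gaelic"],
    "gv": ["Manx", "Manks", "Manx Gaelic", "Gaelg"],
    "gd": ["Scots Gaelic", "Scotch Gaelic", "Erse", "Gàidhlig", "Gaelic"],
    "sco": ["Scots", "Scotch", "Inglis"],
}


def build_index(names_by_code):
    """Invert the table: case-folded name -> set of codes that use it."""
    index = defaultdict(set)
    for code, names in names_by_code.items():
        for name in names:
            index[name.casefold()].add(code)
    return index


INDEX = build_index(NAMES_BY_CODE)


def resolve(name):
    """Return the sorted candidate codes for a name (empty list if unknown)."""
    return sorted(INDEX.get(name.casefold(), ()))


print(resolve("Manx"))      # ['gv']        one candidate
print(resolve("Gaelic"))    # ['ga', 'gd']  ambiguous
print(resolve("Scottish"))  # []            unknown, needs cleanup
```

A name-based system has to decide what to do in the second and third cases (guess, reject, or file the entry in a cleanup category), which is the extra work the comment describes.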
Latest revision as of 01:50, 27 April 2013

Current code can do {{l}}'s job, and {{l}} will use the module instead of its current code soon. The aim of the module is generally handling wikilinks, though -- not just in {{l}}, but in {{term}}, head templates, and other similar templates that create wikilinks. Some new features have been proposed at Template_talk:l#Lua-ising. The code for the features has been written and tested; we just need to gain official community consensus to implement it. Any thoughts or suggestions would be welcomed. --Z 04:28, 1 April 2013 (UTC)
- Have you tested it to make sure it works in all cases that {{l}} works, and that it doesn't do anything it shouldn't? Also, what is the purpose of Module:useful stuff? "detect_script" in particular doesn't seem like it does anything useful. And the list of languages that have automated transliteration should be in Module:languages. I also warned you not to start adding all kinds of extra code to this until we're sure that it works the way it should. —CodeCat 13:41, 1 April 2013 (UTC)
- Assuming "detect_script" does what it seems to based on its name, that would be extremely useful. Several templates for multiscriptal languages like Tatar, Ladino, and Japanese have parameters that require the user to input what script an entry is in. If we can scrap that, that'd be great. —Μετάknowledgediscuss/deeds 14:20, 1 April 2013 (UTC)
- But what do you do when the word contains characters in multiple scripts? —CodeCat 16:40, 1 April 2013 (UTC)
- That doesn't happen in Tatar or Ladino. It does happen in Japanese, and I'm not sure how it works. For example, アメリカ合衆国 is in both katakana and kanji, but is marked as katakana (in the template, that's kk). We'll have to ask a Japanese editor. —Μετάknowledgediscuss/deeds 00:45, 2 April 2013 (UTC)
- Why use kk, the language code for Kazakh? In case it matters, the Japanese ISO script codes are:[1]
  Hira : Hiragana
  Kana : Katakana
  Hrkt : Japanese syllabaries (alias for Hiragana + Katakana)
  Jpan : Japanese (alias for Han + Hiragana + Katakana)
  —Michael Z. 2013-04-03 21:38 z
- This is totally off-topic, but I guess it's a valid complaint about the template. The answer is that it's faster to type, just like pl= (code for Polish, means plural in templates) or tr= (code for Turkish, means transliteration in templates). In cases like these, editors' ease definitely outweighs using ISO script codes, because it really doesn't matter which we use. —Μετάknowledgediscuss/deeds 23:45, 3 April 2013 (UTC)
- Where is the code for the version with the proposed features?
- Automatically detecting script sounds like a very good idea, imo. --Yair rand (talk) 01:22, 4 April 2013 (UTC)
- Some are removed by CodeCat and you can find them in older revisions, some were moved to this module, others are in commented part of the code, e.g. recognizing reconstructed terms from "*" and linking to appendix is in prepare_title(). --Z 01:42, 4 April 2013 (UTC)
- Can somebody please help me work out how to implement script recognition in {{tt-pos}}? —Μετάknowledgediscuss/deeds 02:31, 4 April 2013 (UTC)
- I'm not opposed to these innovations in principle, but I do think that we should first get {{l}} to work with this module, and keep it that way for at least a week or two so that we can be sure there are no unexpected problems. —CodeCat 03:20, 4 April 2013 (UTC)
- Why would we want to get {{l}} working with it first? That might could be a while... —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
- Currently detect_script() can't be invoked from templates, it's a better idea to rewrite that template in Lua. --Z 03:29, 4 April 2013 (UTC)
- Is it? I was hoping to have a model off which I might be able to design more templates with these features. —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
- It would be possible after Lua-ization of {{l}} and {{head}} and adding the ability to detect scripts to them. --Z 05:08, 4 April 2013 (UTC)
- Regarding Japanese, I have no idea how its writing system works, but it's possible to find the katakana characters of a word and tag them with the Kana class, and the other, non-katakana characters of the word (if there are any) would be kanji, I assume? If so, it's easy to fix. Does a similar thing happen in any other language? --Z 03:29, 4 April 2013 (UTC)
- Not that I can think of, but we should assume so just to be safe. —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
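The mixed-script question raised above can be sketched as follows. This is a Python illustration under stated assumptions, not the code of the `detect_script` function in Module:useful stuff: it classifies characters by the main Unicode block ranges only (extension blocks are left out), and it falls back to the ISO 15924 aliases Hrkt and Jpan listed earlier in the thread when more than one script is present.

```python
# Main Unicode block ranges (inclusive). Extension blocks such as
# CJK Extension A or half-width katakana are deliberately left out.
SCRIPT_RANGES = {
    "Hira": (0x3040, 0x309F),  # Hiragana
    "Kana": (0x30A0, 0x30FF),  # Katakana
    "Hani": (0x4E00, 0x9FFF),  # Han (kanji), CJK Unified Ideographs
}


def scripts_in(text):
    """Return the set of script codes that occur in text."""
    found = set()
    for ch in text:
        cp = ord(ch)
        for code, (lo, hi) in SCRIPT_RANGES.items():
            if lo <= cp <= hi:
                found.add(code)
    return found


def detect_script(text):
    """Pick one ISO 15924 code for text, or None if no known script occurs."""
    found = scripts_in(text)
    if not found:
        return None
    if len(found) == 1:
        return next(iter(found))
    if found <= {"Hira", "Kana"}:
        return "Hrkt"  # both syllabaries, no kanji
    return "Jpan"      # any mixture that includes kanji


print(detect_script("アメリカ合衆国"))  # Jpan (katakana + kanji)
print(detect_script("カタカナ"))        # Kana
```

With this approach the example アメリカ合衆国 would not need to be marked as katakana by hand; whether Jpan is the tag the templates actually want is a question for the Japanese editors.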
- The module is tested, the only problem is gender/number part -- output of Module:gender and number and those of gender/number templates are not identical. That's not much of a problem though, we can use the gender templates in the module for now. --Z 05:32, 4 April 2013 (UTC)
- Just use the module. The output of the module can always be changed if necessary, that isn't a good reason not to use it. —CodeCat 22:59, 5 April 2013 (UTC)
- Ok, change it then. I can't even edit the page, you locked it. The gender templates should be used until gender_and_number is fixed.
- The outputs of Template:l and Module:l are identical now (don't know why the fourth test fails, they're the same really). I won't have regular internet access after the 14th and apparently no one else cares about working on the module, so I would be grateful if Module:l is implemented soon so that I would be able to work on the extra features in this period of time. --Z 05:14, 10 April 2013 (UTC)
Substring module

Thanks, Z! New question: do we have a basic string manipulation module, just to store stuff like taking a substring of a certain length from the end of a word, etc.? If not, should I create Module:string or something? —Μετάknowledgediscuss/deeds 00:23, 5 April 2013 (UTC)
- NP. No, that's not needed; for this particular task you can simply use string.sub(). --Z 00:39, 5 April 2013 (UTC)
- But I want to invoke it directly in an un-Luacized template. That's why I reckon there should be a separate module for it. —Μετάknowledgediscuss/deeds 22:51, 5 April 2013 (UTC)
- I see a need to decompose words in any script into components, including any diacritics and ligatures. Use: say you want to check how to pronounce a word in a complex script with diacritics - Burmese, Hindi, Bengali, Thai, Arabic, Hebrew, etc. A Devanagari syllable रा (rā) can't be looked up in Wiktionary:Hindi transliteration because it's र + ा and you can't take out the diacritic ा from रा to look it up. Some word processors allow breaking strings into parts. So, yes, please. Not just the substring but a break-up.
  Module:ko-hangul has the function syllable2Jamo, which in debug mode shows the individual jamo for each hangeul (한 (han) = ㅎㅏㄴ (h a n)). We need to make it break up hangeul in run mode. I tried to do it with "syllable2JamoSep" but it didn't work. --Anatoli (обсудить/вклад) 00:41, 5 April 2013 (UTC)
- If I understand you correctly, you need mw.ustring.gsub(text, "(.)", "%1 ") (try print(mw.ustring.gsub("रा", "(.)", "%1 ")) in console). --Z 00:58, 5 April 2013 (UTC)
- Module:String exists on Wikipedia. Because it doesn't exist here yet, I copied the entire code and added extra bits to it when I wrote Module:bo-translit. Wyang (talk) 01:01, 5 April 2013 (UTC)
- Great stuff, thank you! I also tried Thai "เค็ม" = เ ค ็ ม and Arabic "اَلْلُغَةُ ٱلْعَرَبِيَّةُ" = ا َ ل ْ ل ُ غ َ ة ُ ٱ ل ْ ع َ ر َ ب ِ ي َ ّ ة ُ. --Anatoli (обсудить/вклад) 01:11, 5 April 2013 (UTC)
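For readers without access to the Scribunto console, the same code-point split can be tried in plain Python. This is a sketch of the technique rather than a port of any module: Python strings already iterate by code point, and the standard `unicodedata` module can name each piece, which helps when looking a diacritic up in a transliteration table. `split_codepoints` and `describe` are hypothetical helper names.

```python
import unicodedata


def split_codepoints(text):
    """Return the individual Unicode code points of text, in order."""
    return list(text)


def describe(text):
    """Pair each code point with its Unicode character name."""
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in text]


# Same effect as the gsub call above, minus the trailing space.
print(" ".join(split_codepoints("रा")))

for ch, name in describe("रा"):
    print(f"U+{ord(ch):04X} {name}")
# U+0930 DEVANAGARI LETTER RA
# U+093E DEVANAGARI VOWEL SIGN AA
```

Note that this splits by code point, not by grapheme cluster, so a combining mark comes out as its own item, which is the behaviour wanted here.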
- Wyang, it seems you're able to do auto-translit for a few complicated script languages, especially those you're familiar with, you've done Burmese before in the Chinese Wiktionary, haven't you? I'm especially keen if you could do Hindi, Bengali, Thai, Lao, Burmese and Khmer in any order, possibly Sinhalese. Korean module needs to be made better as well. I'm happy to assist with testing and getting/checking for translit. rules. I'm only familiar to some degree with Korean, Hindi and Thai. Angr is our Burmese expert and Stephen G. Brown knows a bunch, including Khmer and Telugu (no tones in Khmer, so it's simpler than some others). ZxxZxxZ has done a great job with Arabic and Persian but you can only do that much with partially phonetic languages. --Anatoli (обсудить/вклад) 01:23, 5 April 2013 (UTC)
- Wyang knows a lot about Burmese himself, I doubt he'll need much if any help from me for it! —Angr 10:18, 5 April 2013 (UTC)
- OK. Anyway, I hoped that the function print(mw.ustring.gsub("လုံချည်", "(.)", "%1 ")) could also be used for reading out individual characters, so that one could look up each character in e.g. WT:MY TR, but are we missing some characters in လ ု ံ ခ ျ ည ်? I can't find some characters, e.g. ု. --Anatoli (обсудить/вклад) 10:59, 5 April 2013 (UTC)
- ု alone is "u." in MLCTS, ု+ ံ is "um". They are both on the my-translit page. As for the transliteration modules, I'll try to get more familiar with Lua first. While certain string decomposition functions are easier to use than wiki code, some others seem less straightforward, for example the equivalent of {{#switch: (?) Wyang (talk) 11:06, 5 April 2013 (UTC)
- What exactly is the purpose of all these string functions when most of them already have more complete Scribunto equivalents? Isn't this just reinventing the wheel? —CodeCat 22:57, 5 April 2013 (UTC)
No one really answered my original question. For example, we need to strip leading hyphens from suffixes for sorting purposes. Where should I put the function that does that? —Μετάknowledgediscuss/deeds 05:44, 17 April 2013 (UTC)
- Use mw.text.trim(text, "%-"). Almost any string-related function you can think of is already defined in Lua and Scribunto. --Z 06:42, 17 April 2013 (UTC)
- You don't get it. I know that, I just want to know where to put it. We need to be organized before we put stuff in templates. —Μετάknowledgediscuss/deeds 13:53, 17 April 2013 (UTC)
- I understood that was just an example, but as I said, basic string-related functions are already defined and I can't think of anything that isn't defined. Maybe I'm misunderstanding what you mean, so let's wait for another person to comment. --Z 14:24, 17 April 2013 (UTC)
- Non-basic string functions are usually language-specific, and so far we have put them in Module:xx-common modules, where xx is the language code. Anything else (if there is any) may be put here. --Z 14:28, 17 April 2013 (UTC)
- So I should put this in Module:useful stuff? And what should I name the function, suffixSort? —Μετάknowledgediscuss/deeds 14:33, 17 April 2013 (UTC)
- Yeah, that looks fine. There was a worthless, time-wasting discussion about what to name that module, and after thinking a lot, I came up with that title, because I think thinking about what to name a module is ridiculous; no one cares what it is named, only that it works, so I named it after the first thing that came to my mind. I suggest you do that too. --Z 15:08, 17 April 2013 (UTC)
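The function being discussed is small enough to sketch. The Python below illustrates the idea only; it is not the Lua that would go into Module:useful stuff, and `suffix_sort_key` is a hypothetical name standing in for the proposed suffixSort. One point worth noting: as far as the Scribunto documentation describes it, `mw.text.trim` strips the given characters from both ends of the string, whereas the stated requirement is about leading hyphens only, so this sketch strips the left side alone.

```python
def suffix_sort_key(title):
    """Drop leading hyphens so that '-ness' sorts under 'n', not under '-'."""
    return title.lstrip("-")


entries = ["-ness", "apple", "-ly", "zebra"]

# Sorting by the key interleaves suffixes with ordinary words.
print(sorted(entries, key=suffix_sort_key))
# ['apple', '-ly', '-ness', 'zebra']

# Only the leading hyphen is removed; an interfix keeps its trailing one.
print(suffix_sort_key("-o-"))  # o-
```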
Documentation subpage tab for templates

As an example, for {{head}}, the tab at the top is still Template:head/doc, but we're migrating these to /documentation (Template:head/documentation). I guess there's a MediaWiki page somewhere that needs updating. Please. Mglovesfun (talk) 12:23, 5 April 2013 (UTC)
- Yes, but currently the majority still uses /doc. Would it be possible for it to support both, until all the subpages have been moved over? {{documentation}} already does this (it shows /doc if it exists, but prefers /documentation otherwise) —CodeCat 14:17, 5 April 2013 (UTC)
An idea for a new way to make form bots

This idea just kind of came to me but I think it could be very useful. The way that User:MewBot and probably all the other bots currently work is that they parse the invocation of the template, and then try to mimic the template as closely as possible. This is really a lot of work and it's not ideal, because it means all the code has to be duplicated and kept synchronised. So I thought... why not let another template or Lua generate the forms in a machine-readable format? That way, the bot only has to understand the output, but no longer has to duplicate any of the intricacies of the template/module.

I have added this to Module:nl-verb and less than 5 minutes later I already have a working example. :) See User:CodeCat/bot example. To use it on any Dutch verb table, just add the parameter bot=1 to the template. It is really easy to do as long as your template or module has a strict separation between the part that generates all the forms and the part that displays the table. By writing a second output function that displays a list of forms instead of a fancy table, you can get this result. It can even be done without Lua at all, but Lua does make it a lot easier.

So what can we do with this? In its current form, a bot script could "find" the inflection template on a page like it does before, but it could then add the bot=1 parameter and expand the template (via the MediaWiki API, which is built into the Python wikibot framework). It can then parse the machine-readable output and use that to create entries for each of the forms as before, instead of having to generate all the forms itself.

However, this concept could be taken further. The template or module that generates the machine-readable format could actually generate the full form-of entries itself, so that the bot doesn't even need to parse the output but could just flat out create entries straight from it. This, in turn, would open up one huge door: a single bot that can do any inflection table, in any language, without modifications, as long as the template/module generates the proper output. —CodeCat 01:31, 6 April 2013 (UTC)
- I've now made these changes to User:MewBot, and it seems to work ok with the few test entries I've tried it on. —CodeCat 22:01, 6 April 2013 (UTC)
- Looks good to me. —RuakhTALK 23:54, 6 April 2013 (UTC)
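The separation CodeCat describes (one part generates the forms, two interchangeable parts display them) can be sketched in Python. Everything here is an assumption made for illustration: the inflection rule is a toy, the `key=value` line format is invented and is not necessarily what Module:nl-verb emits with bot=1, and all function names are hypothetical.

```python
def generate_forms(stem):
    """Toy form generator; stands in for a module's real inflection logic."""
    return {
        "inf": stem,
        "3sg": stem + "s",
        "past": stem + "ed",
        "ptcp": stem + "ing",
    }


def render_table(forms):
    """Human-facing output: a simple wikitext table."""
    rows = [f"| {key} || {value}" for key, value in forms.items()]
    return "{|\n" + "\n|-\n".join(rows) + "\n|}"


def render_bot(forms):
    """Machine-facing output (the bot=1 case): one key=value pair per line."""
    return "\n".join(f"{key}={value}" for key, value in forms.items())


def parse_bot_output(text):
    """What the bot does: read the list back without knowing any grammar."""
    forms = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition("=")
        forms[key.strip()] = value.strip()
    return forms


forms = generate_forms("walk")
print(render_bot(forms))
assert parse_bot_output(render_bot(forms)) == forms
```

The bot depends only on `parse_bot_output`, so changes to the inflection rules in `generate_forms` never have to be mirrored in the bot's code.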
toolserver

I've tried to use several toolserver utilities yesterday and this morning, such as SUL accounts, and I have not been able to make a connection. Is toolserver temporarily down, or is it gone? I remember someone saying that the toolserver tools were being migrated to Wikimedia Labs. If the SUL accounts tool is now at Wikimedia Labs, is there a way to fix the links here (SUL accounts link appears at the bottom of anyone's contributions page, such as Special:Contributions/Stephen_G._Brown)? —Stephen (Talk) 06:16, 6 April 2013 (UTC)
- The toolserver has issues, but it is unrelated to tools migrations. Dakdada (talk) 15:40, 6 April 2013 (UTC)
- Thanks. It seems to be working again, although very slow. —Stephen (Talk) 08:48, 7 April 2013 (UTC)
Limits on display of "orange link for missing language section"

If one checks the appropriate box in per-browser preferences, links display in orange instead of blue "if the target language is missing on an existing page" (actually if the specified section of whatever kind is missing), e.g. plate#Latin. plata#Latin displays blue though there is no Latin section at [[plata]]. Does anyone know whether there is some kind of limit on the number of headings that are searched in the operation of this feature? DCDuring TALK 12:28, 6 April 2013 (UTC)
- No idea, but the reverse happens too. In the etymology sections of သစ်, 薪#Mandarin, 薪#Cantonese, 新#Middle Chinese, and 新#Cantonese all appear orange even though those sections do exist. —Angr 12:36, 6 April 2013 (UTC)
- Angr, that's a different problem. I'm pretty sure that's because none of the sections you linked to has a definition yet. —Μετάknowledgediscuss/deeds 16:45, 6 April 2013 (UTC)
- Uncategorized sections come up orange, yes. Mglovesfun (talk) 18:21, 6 April 2013 (UTC)
- OK, that's good to know. But what about DCDuring's issue? Why is plata#Latin blue? —Angr 18:28, 6 April 2013 (UTC)
- I searched the plata page for the word "Latin". It appeared in a context note in the second Spanish def.
- After deleting that context note, so the word "Latin" doesn't appear on that page, the plata#Latin link now appears in orange.
- Yair, that looks like a bug, maybe in page parsing -- would that be hard for you to fix? -- Eiríkr Útlendi │ Tala við mig 19:47, 6 April 2013 (UTC)
- No, that's the same problem. It's not that you removed Latin from the text, it's that you removed the page from Category:Latin American Spanish. The script works by examining a page's categories; if the page appears in a category that starts with a given language-name, then the script infers that the page has a section for that language. (In particular, the script does not download and parse the page. That would give more accurate results, but would be prohibitively expensive.) —RuakhTALK 00:09, 7 April 2013 (UTC)
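The heuristic Ruakh describes can be sketched in a few lines. The real gadget is client-side JavaScript; the Python below only illustrates the logic, and the category list for plata is invented for the example apart from Category:Latin American Spanish, which is named in the comment. Category names are assumed to be given without the "Category:" prefix.

```python
def has_language_section(categories, language):
    """Infer a language section from category names alone.

    A page is assumed to have a section for `language` if any of its
    categories starts with that language name.
    """
    return any(cat.startswith(language) for cat in categories)


# Illustrative category list for [[plata]] before the context note was removed.
plata = ["Spanish nouns", "Latin American Spanish"]

print(has_language_section(plata, "Spanish"))  # True, correct
print(has_language_section(plata, "Latin"))    # True, a false positive

# After the context note (and with it the category) is removed:
print(has_language_section(["Spanish nouns"], "Latin"))  # False
```

The false positive arises because the check never sees the page text, only category names, which is also why it is cheap enough to run for every link.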
- And that answers the lingering question I had about why orange appears when I try to link to a PoS or Etymology section in English. DCDuring TALK 00:40, 7 April 2013 (UTC)
- Does the script not look at hidden categories? 新 has a hidden Cantonese category (definition needed). Why not use it? DCDuring TALK 00:44, 7 April 2013 (UTC)
- Re: "Does the script not look at hidden categories?": Correct, it doesn't. Re: "Why not […] ?": Not being the author, I can only speculate, but it sort of makes sense to me: if our only Cantonese category is Category:Cantonese definitions needed, then we arguably don't really have a Cantonese entry. I mean, sure, we've got the ==Cantonese== L2 header, but we don't even identify the POS? —RuakhTALK 02:20, 7 April 2013 (UTC)
- I don't know if it would help or even be useful, but we could restrict the script further by requiring that a PoS category have a specific name. The script would have a list of possible PoS names which it could choose from, and it would make the link orange if it finds no match. Since "American Spanish" isn't a PoS category, it would fix the problem above. —CodeCat 02:31, 7 April 2013 (UTC)
- That doesn't seem like it should impose much of a performance penalty. (Does Lua/Scribunto help?) If there is any significant performance penalty, we should just learn to live with the problem.
- I guess there could be other instances of a name of regional dialect or dialect grouping starting with a language name. DCDuring TALK 04:00, 7 April 2013 (UTC)
- No performance penalty, no. (Also, that part of it would be handled entirely on the client-side (your browser), via JavaScript, so even if it did have a performance impact, the considerations would be a bit different than for stuff that runs on the server-side.) —RuakhTALK 05:26, 7 April 2013 (UTC)
- have#Etymology also comes up orange (well, yellow to me) because there are no categories starting with the word Etymology. Mglovesfun (talk) 11:33, 7 April 2013 (UTC)
- We could apply the same idea to that as well. If the script is able to recognise which categories are valid, maybe it could also tell a valid language from an invalid one. The problem, of course, is that there are hundreds of languages, so putting them all into the script might be overkill. I guess this is yet another example of a case where linking to non-language sections just doesn't work. And I kind of agree with that anyway, because have#Etymology would link to the first Etymology section on the page. That works out right in this case, but what about cantar#Conjugation? And never mind if we want to link to the etymology of another language, then you're out of luck... —CodeCat 13:02, 7 April 2013 (UTC)
- It only matters for those who select the preferences box. It doesn't really matter for the users we should care most about: casual users. The simple and cheap part (PoS) may be worth installing in the script, not the language part.
- I still have hopes that links to English L2s can be more narrowly directed to the appropriate headers, like Etymology n and the PoS headers, instead of running the risk of confusing users at our longer entries. DCDuring TALK 14:07, 7 April 2013 (UTC)
- I prefer the opposite, that all languages can be treated equally so that people don't get confused when something that works for English doesn't work for any other language. That doesn't mean that we wouldn't be able to link to specific sections, but we should be able to link to the specific section of any language. Currently, Mediawiki equates sections with the actual name of the header, so it assumes that every header will only ever appear once on a page. That is really a rather strange assumption. —CodeCat 14:11, 7 April 2013 (UTC)
- English is the host language. It already behaves differently.
- The less attractive we make this for normal users the less we will track real English. Plenty of users are confused by the existence of uppercase entries for English words that don't contain English sections, just German or Translingual. Why not finish the job by putting English in its alphabetical position in multilanguage entries and encourage non-English-language discussions for non-English entries? We already run the risk of turning this wiktionary into one that doesn't track current UK or US English, but some blend of what Webster 1913 tracked and Globish. And there will be fewer to challenge the dated, archaic, and obsolete glosses in our FL entries. But it will still be fun for polyglots. DCDuring TALK 14:25, 7 April 2013 (UTC)
- "Bla bla bla think of the children, it will be doom if we allow such transgressions". Ok, can you actually read what I am saying and not go off on a panic spree? —CodeCat 16:07, 7 April 2013 (UTC)
- Fixed. --Yair rand (talk) 22:22, 7 April 2013 (UTC)
- What is fixed exactly? Not 新#Cantonese which should arguably not be.
- The non-L2 header links? So, for a missing header like Gorilla#Etymology at Gorilla#Translingual, it is the contributor's responsibility to avoid the reference.
- Is plata#Latin fixed using CodeCat's approach? DCDuring TALK 23:06, 7 April 2013 (UTC)
- See diff. I set it to ignore categories listed in the exceptionCategories array, which currently only contains Category:Latin American Spanish. Not an ideal solution, but I can't think of a better one that won't come with its own problems.
- I think it would be best to leave the 新#Cantonese issue as it is. A link to an empty section like that isn't really much better than a broken one. --Yair rand (talk) 00:10, 8 April 2013 (UTC)
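The two fixes discussed in this thread, Yair rand's exception list and CodeCat's part-of-speech whitelist, differ in how they fail, which a short sketch makes visible. This is Python for illustration only (the gadget itself is JavaScript); `EXCEPTION_CATEGORIES` mirrors the `exceptionCategories` array mentioned above, while `POS_NAMES` and the example category lists are assumptions.

```python
# Mirrors the gadget's exceptionCategories array as described above.
EXCEPTION_CATEGORIES = {"Latin American Spanish"}

# Hypothetical, incomplete list of part-of-speech category suffixes.
POS_NAMES = ("nouns", "verbs", "adjectives", "adverbs")


def by_exception_list(categories, language):
    """Prefix match, but skip categories known to be misleading."""
    return any(
        cat.startswith(language) and cat not in EXCEPTION_CATEGORIES
        for cat in categories
    )


def by_pos_whitelist(categories, language):
    """Accept only categories of the exact form '<language> <PoS>'."""
    wanted = {f"{language} {pos}" for pos in POS_NAMES}
    return any(cat in wanted for cat in categories)


plata = ["Spanish nouns", "Latin American Spanish"]
print(by_exception_list(plata, "Latin"))  # False
print(by_pos_whitelist(plata, "Latin"))   # False

# A section whose only category is a maintenance one:
stub = ["Cantonese definitions needed"]
print(by_exception_list(stub, "Cantonese"))  # True
print(by_pos_whitelist(stub, "Cantonese"))   # False
```

The exception list errs towards blue links and needs a new entry for every misleading category; the whitelist errs towards orange links and needs a complete list of part-of-speech names.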
- Thanks for making the change and for the helpful explanation. If I see a problem that can't be solved by adding to the array, I'll let you know. DCDuring TALK 00:57, 8 April 2013 (UTC)
- Um, plata#Latin still shows up as blue for me, despite the lack of any Latin entries on that page. Flushing my browser cache (Chromium on Ubuntu) doesn't seem to have any effect on this issue. -- Eiríkr Útlendi │ Tala við mig 22:19, 8 April 2013 (UTC)
- If you have the per-browser preference box "Color translation links orange instead of blue if the target language is missing on an existing page." checked, I wonder whether it's indeed a browser/OS issue. Do others with Chrome on other OSs have the problem? We don't - and probably can't - have a robust capability for handling less common combinations on our local specials. DCDuring TALK 22:46, 8 April 2013 (UTC)
Very minor bug

- Relevant script: MediaWiki:Gadget-PatrollingEnhancements.js

In the recent changes script that provides admins with the red bar at the bottom right, when I type in that bar it scrolls down the page. It's so innocuous I haven't bothered to report it yet. So if I type a really long deletion summary, I then have to scroll back up to click the red 'd' link to delete the offending page. Mglovesfun (talk) 11:24, 7 April 2013 (UTC)
- What browser/system are you using? FWIW to whomever troubleshoots/fixes this issue, I can't reproduce that behaviour; I can type lots of text into the deletion-summary bar, and the page doesn't scroll (using Firefox 19 on Windows XP). - -sche (discuss) 22:43, 7 April 2013 (UTC)
- Hmm. Can you experiment a bit, and describe the behavior a bit more precisely? For example:
- If you type a single letter, then wait a moment, then type another letter, do you find that it scrolls a certain distance when you type the first letter, and then scrolls the same distance when you type the second letter?
- Or do you find that the scrolling only happens when you hit the space bar? (In many browsers, the space bar can be used to scroll down the page, but of course it's not supposed to do that when you type an actual space into an actual text field!)
- —RuakhTALK 23:03, 7 April 2013 (UTC)
- Couldn't duplicate just by typing in the box no matter how many spaces using FF 19.0.2. and Windows, but I didn't undertake any actual deletion. DCDuring TALK 23:13, 7 April 2013 (UTC)
- One small downward movement per character typed, not only the space bar, the same length of movement for every character. Using Google Chrome - I hadn't thought of testing it in Explorer and Firefox which I also have but no longer use. Mglovesfun (talk) 12:29, 9 April 2013 (UTC)
The category needs a bit of clean up. With or without the main entry, the structure of the pinyin entry should stay strictly a link to Chinese characters - one word per line, traditional/simplified (if different) are passed as additional parameters. There should be no other definitions, pronunciation or references. (Monosyllabic are a bit different). The parameters are unnamed. There were only a few entries, which used sim/trad, simp/trad named parameters. They are now gone and could be removed from the Template:pinyin_reading_of template. Here's an example of a correct entry where trad/simp. are different characters "biànchéng": ==Mandarin== ===Romanization=== {{cmn-pinyin}} # {{pinyin reading of|變成|变成}} I wonder what tools we have for repetitive tasks like this other than manual editing -need to remove all references to dictionaries, short translations and pronunciation sections. --Anatoli (обсудить/вклад) 01:54, 8 April 2013 (UTC) [edit] GSOC Proposal -Pronunciation Recording Extension Hello Everyone, On the wikitech-l mailing list,i saw Pronunciaton Recording Tool feature request so i felt i could give this a try and now i am planning to undertake this as my gsoc project Since i am new to open source development if anyone could help me out or mentor me through the project i would be very happy Thanks Rahul(Rahul_21 on the IRC #mediawiki,#wikimedia-dev) - A quick update on this:
- Rahul21 has collected a team, including MDale as code mentor and Lars Aronsson as the community liaison. The Google Summer of Code application draft is firming up with feedback from WMF, Mozilla, Vorbis, and several language editions of Wiktionary. If the English Wiktionary wants to have influence in the project's goals, now is the time!
- - Amgine/ t·e 15:16, 16 April 2013 (UTC)
Question about data normalization and where entry data goes
- Previous discussions of somewhat similar ideas: WT:BP#Restructuration_of_foreign_languages, WT:BP#Pages_getting_too_big.3F
I've been chewing on this idea for some time. Why do we put all data for all languages in one big page? This is extremely messy, from a data organization viewpoint. Putting each language on its own subpage would resolve many different issues. (Ignore, for the moment, the major amount of work required to refactor existing entries to implement this.)
- Instead of putting all information for all languages that have a term spelled "ni" on the [[ni]] page, there would be one [[ni]] page with one subpage for each language: [[ni/Navajo]], [[ni/Japanese]], etc. Or, perhaps using the lang codes, giving [[ni/nv]], [[ni/ja]], etc.
- Each subpage would be transcluded onto the main page, so any reader looking at [[ni]] would perceive no change.
- One could tell immediately whether a term were missing in a given language, without having to parse the page or check categories. This would resolve, or at least help resolve, issues such as the link color oddities for plata#Latin in the thread further up this page.
- Special:WhatLinksHere would be much more useful -- one could find out much more easily whether a given template is used by a given language, for instance.
- Tabbed languages would potentially be simpler to implement.
I'm keen to find out if anyone knows why we are using the current "all languages on one page" format. I suspect it's entirely due to legacy data and momentum, but I'm aware that I may be missing some other big gotcha or limitation of the MediaWiki software that would render the "each language on a subpage" data format unworkable. Curious, -- Eiríkr Útlendi │ Tala við mig 22:36, 8 April 2013 (UTC) - I would support this idea and have supported it in the past, but it seems to have enough opposition here that it never actually gets any further than an idea. —CodeCat 22:50, 8 April 2013 (UTC)
- I somehow like the reverse naming: [[ja/ni]], [[nv/ni]], but I guess this is harder to use. Wyang (talk) 23:13, 8 April 2013 (UTC)
- Would the parent pages ([[ja]] and [[ni]] in this example) then be the indices for each language?
- However, that does run into the problem that [[ja]] and [[ni]] are already existing pages. -- Eiríkr Útlendi │ Tala við mig 23:39, 8 April 2013 (UTC)
- All those look like real advantages. It would also mean that templates such as {{l}} would be unnecessary for same-language links and that omitted lang= parameters in {{term}} etc. could default to the same language. It would also dramatically improve load time for non-English terms that were homographs of English terms, especially those with really large translation tables. The load-time problem for English terms with large translation tables would not be helped significantly, unless translation tables, too, were on a separate subpage, not automatically transcluded.
- How should Translingual pages (characters, symbols, taxons) be handled in such a regime? DCDuring TALK 23:18, 8 April 2013 (UTC)
- Translingual entries would presumably be at the */mul subpage, so [[ni/mul]] in this example. My thought is that the top-level term page would only ever be the container, but perhaps some other arrangement might work better. -- Eiríkr Útlendi │ Tala við mig 23:39, 8 April 2013 (UTC)
- Translingual entries are supposed to be useful in many languages. Taxons are usable by biology professionals even in languages using non-Latin script. Characters and symbols have similar broad reach. That's the justification for having them at the top of the page, so that there is no need to repeat the content in every (applicable) language. We would need some kind of obvious way of reminding users of these things and might have to be much more explicit about the precise scope of the Translingual terms and each particular sense thereof. We have largely finessed this point. DCDuring TALK 02:03, 9 April 2013 (UTC)
- Agreed. I'm not sure how this affects this proposal, however? The idea is that anyone looking at [[ni]] (or any other page) as a reader would see exactly what's already there. -- Eiríkr Útlendi │ Tala við mig 05:08, 9 April 2013 (UTC)
- If all the content is transcluded, then the page-load time improvement doesn't apply. In fact page-loads would be slightly worsened. DCDuring TALK 11:06, 9 April 2013 (UTC)
- Both arrangements have advantages. When it comes to templates or modules, it doesn't really matter because we can extract both the base name and the subpage name. One advantage of putting the word first is that it matches our current arrangement somewhat more, because each base page would then have one subpage for every language. On the other hand, the reverse arrangement with the language first is more like how we treat Appendix entries. I don't think either one really has any clear advantages or disadvantages, it's more down to our own preference and logical approach to entries. Another question we need to answer though is whether we use language names or codes in the title. And what to do with the thousands of bare links and uses of {{term}} without a lang= parameter, which will break? —CodeCat 00:46, 9 April 2013 (UTC)
- Lang codes would be shorter. Lang names would be more human-friendly.
- Why not use both? Lang names could redirect to the lang code subpages, or the reverse, as deemed appropriate.
- {{term}} would presumably just link to the bare entry, [[ni]] in this case, which would be the container into which all of the language pages would be transcluded. [[ni/mul]] would go at the top if it exists, followed then by [[ni/en]], and then all the other langs in alphabetical order. -- Eiríkr Útlendi │ Tala við mig 00:51, 9 April 2013 (UTC)
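The ordering described above (Translingual first, then English, then the remaining languages alphabetically) could be sketched in Lua roughly as follows. The function names, the small code-to-name table and the choice to sort by language name rather than by code are all assumptions made for illustration; nothing here is taken from an existing module.

```lua
-- Hypothetical subset of a language code -> language name table.
local lang_names = {
    mul = "Translingual",
    en = "English",
    ja = "Japanese",
    nv = "Navajo",
    cy = "Welsh",
}

-- Rank a code: Translingual first, English second, everything else after.
local function rank(code)
    if code == "mul" then return 0 end
    if code == "en" then return 1 end
    return 2
end

-- Return a new list of language codes in the proposed display order.
local function order_sections(codes)
    local sorted = {}
    for i, code in ipairs(codes) do
        sorted[i] = code
    end
    table.sort(sorted, function(a, b)
        if rank(a) ~= rank(b) then
            return rank(a) < rank(b)
        end
        -- Within the same rank, sort by language name (falling back to the code).
        return (lang_names[a] or a) < (lang_names[b] or b)
    end)
    return sorted
end

-- prints: mul en ja nv cy
print(table.concat(order_sections({ "nv", "en", "cy", "mul", "ja" }), " "))
```

A container page for [[ni]] would then transclude [[ni/mul]], [[ni/en]], and so on in that order.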
- The only way to make redirects like that work is to have a redirect for every entry. I don't see that happening... —CodeCat 01:21, 9 April 2013 (UTC)
- We have bots for basic maintenance stuff, no?
- Assuming we're putting the data under the lang code, then a bot would check for each [[ni/langcode]], to see if there is a corresponding [[ni/langname]]. If it's missing, the bot would create it as a redirect to [[ni/langcode]].
- But that's even assuming that we'd want both lang code and lang name URLs. -- Eiríkr Útlendi │ Tala við mig 05:12, 9 April 2013 (UTC)
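The bot check described above can be sketched as a pure function over a list of page titles. This is only an illustration in Lua: the title list, the code-to-name table and the function name are hypothetical, and a real bot would query and edit the wiki rather than work on an in-memory list.

```lua
-- Hypothetical subset of a language code -> language name table.
local lang_names = { ja = "Japanese", nv = "Navajo" }

-- Given the titles that already exist, list the redirects a bot would
-- still need to create from [[term/langname]] to [[term/langcode]].
local function missing_redirects(existing_titles)
    local exists = {}
    for _, title in ipairs(existing_titles) do
        exists[title] = true
    end
    local todo = {}
    for _, title in ipairs(existing_titles) do
        -- Split "ni/ja" into base "ni" and subpage "ja".
        local base, code = title:match("^(.+)/([^/]+)$")
        local name = code and lang_names[code]
        if name and not exists[base .. "/" .. name] then
            todo[#todo + 1] = { from = base .. "/" .. name, to = title }
        end
    end
    return todo
end

local todo = missing_redirects({ "ni", "ni/ja", "ni/nv", "ni/Japanese" })
for _, r in ipairs(todo) do
    print(r.from .. " -> " .. r.to)  -- prints: ni/Navajo -> ni/nv
end
```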
- I don't see any significant possible benefit, and a lot of potential downsides. Anyway, we have no real technical means to do this. --Yair rand (talk) 01:01, 9 April 2013 (UTC)
- Could you add more detail to that? I'm not aware of the downsides, which is partly why I asked.
- I'm also a bit confused about your comment that "we have no real technical means to do this" -- this would be very bot-able, as there's nothing that complicated involved in changing the structure of existing entries, just the tedium of actually doing so. -- Eiríkr Útlendi │ Tala við mig 01:19, 9 April 2013 (UTC)
- We don't have any way that I know of to have each page display the contents of each of its subpages. The downsides: Categories would be severely messed up. The category pages would be packed with language codes, and would link to problematic "part-entries". The entries either wouldn't list categories at all at the bottom, or would be doubled in the category page. Special:WhatLinksHere would also go to the "part-entries", and would additionally contain main entry duplicates for every link. The search bar would be clogged with extra suggestions that wouldn't mean much to the users. --Yair rand (talk) 01:34, 9 April 2013 (UTC)
- Splitting these various issues up for reply.
- "Categories would be severely messed up."
- "The category pages would be packed with language codes,"
- Not necessarily a problem. I'd actually prefer it if I could tell at a glance whether a term in a given language were present in a category.
- Is this helpful to our readers? Afaik, all mainspace categories are single-language. Having "/French" added to every link would just cause confusion. --Yair rand (talk) 06:44, 9 April 2013 (UTC)
- "[The category pages] would link to problematic "part-entries"."
- I think maybe you've misunderstood what I'm proposing? Or maybe I've misunderstood what you mean by "part-entries"? The idea is that the whole Japanese entry would be moved to [[ni/ja]] or [[ni/Japanese]], depending on whether folks prefer lang codes or lang names. There wouldn't be any part-entries.
- If you think that having the entire Japanese entry for ni at [[ni/Japanese]] would be problematic, I don't understand what would be problematic about that. Could you explain?
- I thought you were suggesting that the viewing of entries would still take place through a central non-single-language page so that there could still be quick switching between languages. If so, then links that go directly to single-language entries ("part-entries") as though they were full entries would be problematic. --Yair rand (talk) 06:44, 9 April 2013 (UTC)
- Unclear what the problem would be. If a reader clicks a link that is intended by both the reader and the editor who added the link to lead the reader to the Romanian entry, then I fail to see any problem at all if the user does not see the Welsh entry. Actually, not seeing the Welsh entry could be argued to be a bonus, rather than problematic.
- If instead you mean that the problem is ease of changing between languages for any given term, what I'm envisioning would ultimately be a combination of a parent page (see sample at [[User:Eirikr/Sandbox3/ni]]) that would be identical to our current format for readers, and the individual language subpages (such as [[User:Eirikr/Sandbox3/ni/sq]] or [[User:Eirikr/Sandbox3/ni/cy]]) which would ultimately be quite similar in appearance to Tabbed Languages. One would provide an all-in-one view, the other would provide just the target language, with links at the top to the others.
{{subpages}} already offers a basis from which to create such a header. -- Eiríkr Útlendi │ Tala við mig 02:49, 10 April 2013 (UTC)
- "The entries either wouldn't list categories at all at the bottom,"
- I just created a test sample at [[User:Eirikr/Sandbox3/ni]]. All cats on the various sub-pages appear on the parent page (excluding those cats where the including template has logic to check the namespace and only include the cat if in the main namespace).
- "...or [the entries] would be doubled in the category page."
- Category:Swahili_terms_needing_attention is called from the test page. I do see both [[User:Eirikr/Sandbox3/ni]] and [[User:Eirikr/Sandbox3/ni/sw]] listed there now.
- While this is an issue, it seems little more than a minor nuisance, and is not insurmountable. For categories included by templates, the templates could contain logic to limit category inclusion to only lang-specific sub-pages.
- Fixing this problem would cause the one mentioned above, and vice versa. --Yair rand (talk) 06:44, 9 April 2013 (UTC)
- "Special:WhatLinksHere would also go to the "part-entries","
- See above about "part-entries".
- "and [Special:WhatLinksHere] would additionally contain main entry duplicates for every link. "
- Yes, this would be an issue, but again, it seems little more than a minor nuisance.
- "The search bar would be clogged with extra suggestions that wouldn't mean much to the users."
- This could be at least partially addressed by using lang names instead of lang codes in the URLs. If a user searches for [[ni]] in search of the Zulu term, and [[ni/Zulu]] is one of the hits, they know right where to go.
- Past there, the search index hasn't yet been updated to include my test page.
- So aside from the "part-entries" bit where I'm not sure what you mean, it looks like the net negative effect would be a couple of minor nuisances.
- In terms of positives, just on the surface of it, it would be much easier to tell what languages have an entry for any given term. This does away with a whole class of problems, including the script rejiggering required just earlier this week to handle lang-specific orange links. Does that really qualify as "[not] any significant possible benefit"?
- -- Eiríkr Útlendi │ Tala við mig 06:18, 9 April 2013 (UTC)
- That issue is bug 16561, which is probably far more likely to be fixed than the necessary changes to split entries into language subpages, largely because identifying broken section links is also a somewhat important issue for a certain large sister project of ours. --Yair rand (talk) 06:44, 9 April 2013 (UTC)
- I see a lack of any comments in that thread since late 2010. It's also not entirely clear what "certain large sister project of ours" you intend; I assume you mean Wikipedia, but that isn't very clear from the thread. -- Eiríkr Útlendi │ Tala við mig 02:49, 10 April 2013 (UTC)
- Sounds like a great idea to me.
- But why transclude the entries into the root page? Just put an automatic index there, floated to the right of the multilingual entry. Categories should collect entries okay, but can they display their titles correctly? The only problem I can see is displaying the title at the top of a language entry's page.
- If we're reorganizing, is there a way to avoid adding a level of subheadings for "Etymology 1," &c.? —Michael Z. 2013-04-09 02:50 z
- I proposed transclusion simply because that can be done right now, whereas an automatic index would presumably require that someone code one up first. Now that we have Lua, that should be easier to do. It could also have the side benefit of obviating the nuisance issues mentioned above.
- However, I'm also keen to avoid any major disruption to readers, and transclusion into the parent page would result in an entry page visually identical to what we already have.
- I mean that while rat#Noun is an <h3> heading as expected, root#Noun is, inconsistently, an <h4> because it is pushed down by the "Etymology 1" and "Etymology 2" headings. Bugs me. —Michael Z. 2013-04-09 23:02 z
- Why is it that WT:ELE has us do it the way we do? Is it a relic of a pre-CSS approach to making an entry's heading look good? DCDuring TALK 23:10, 9 April 2013 (UTC)
- I think it's an information-organization problem at its root. Our page/heading structure is term/language/etymology/p.o.s, e.g., Rat/English/Etymology/Noun. But we omit the Etymology 1 subheading when there is only one. There's nothing really wrong with this, although it must add a layer of complication to some bots that need to find the subheadings. I think we could style the subheadings consistently by selecting on the IDs of etymology headings, if we can pick a reasonable style for the extra etymology heading itself.
- HTML5 adds new elements (<article>, etc.) and a new document structure model that could help make sense of this, but MediaWiki doesn't support this yet. —Michael Z. 2013-04-10 01:45 z
- And then there are the 1,827 members of Category:Entries_with_Pronunciation_n_headers, which don't conform to WT:ELE and can't be reconciled with it, at least in EP's opinion. DCDuring TALK 01:50, 10 April 2013 (UTC)
- I oppose splitting entries into subpages for the same reasons I opposed it the last time it was proposed. - -sche (discuss) 04:17, 9 April 2013 (UTC)
- Reading that link, it sounds like you would not be opposed to changes provided the reader sees no difference. Is that still the case?
- It also sounds like your opposition was partly based on different ideas to what I'm proposing here -- splitting into subpages as I'm imagining it would be based purely on language, and not have anything to do with page size. This is based more on my understanding of how our infrastructure works with terms on a by-language basis, and the kinds of workarounds required because finding out which languages have entries for any given term is more complicated than just looking at the existing URLs. -- Eiríkr Útlendi │ Tala við mig 06:26, 9 April 2013 (UTC)
- I wouldn't like it as I would like to read the wikitext of all the entries in one go, and use the auto-formatting properties of User:Mglovesfun/vector.js. However, if there were a majority in favor of it, I'm sure I'd get used to it. Mglovesfun (talk) 12:08, 9 April 2013 (UTC)
- I vehemently oppose the change; I think it would be an all-around bad move. It would not only make it more difficult for editors like me and Mg and Meta, who often edit all language sections in one go, it would make it more difficult for newcomers to start editing Wiktionary: they'd click the 'edit' link at the top of the page and see nothing but a few lines encased in curly brackets.
- Furthermore, you propose to avoid changing how entries like [[foo]] look. That is good, inasmuch as fragmenting the displayed content would cause major problems separate from those that fragmenting the actual content/wikitext would cause. But ensuring that the display does not change requires double effort on the part of editors, to always edit foo after creating foo/bar, and requires constant vigilance on the part of other editors to ensure no-one forgets to transclude a foo/bar into a foo.
- If any part of that vigilance is entrusted to a bot, the bot will have to be reliable enough that nothing slips through the cracks while the bot is down, and smart enough that it can handle or flag creations of [[foo/not-a-real-code]] and not mis-handle naming conflicts...
- ...because, as Liliana mentioned in the previous discussion, enough words are spelt with slashes that naming conflicts are inevitable. For example, s/he would seem (to a bot) to be a Hebrew entry (missing a Hebrew L2, no less!) that should be transcluded into s; it would also conflict with any real Hebrew entry [[s]], if one ever needed to be created.
- This proposal would duplicate every main-namespace page, even that supermajority of NS:0 pages which have only one language section. It would move the first-person plural imperfect active indicative of fugio to fugiebamus/la, but leave a shell at fugiebamus to contain it; likewise it would have arrodillasen/es transcluded into an otherwise empty arrodillasen, with the aforementioned constant effort required to ensure that if [[arrodillasen/foo]] ever were created, it would be transcluded. It would be easier and IMO better to leave the content at fugiebamus and arrodillasen. - -sche (discuss) 03:58, 10 April 2013 (UTC)
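The naming conflict mentioned above can be made concrete with a small Lua sketch. It assumes a hypothetical rule that a title is a language subpage whenever the part after the last slash is a known language code; the code table and function name are invented for illustration.

```lua
-- Hypothetical subset of known language codes.
local known_codes = { he = true, la = true, es = true }

-- Split a title into (base, code) if the last path segment looks like a
-- language code; otherwise return the whole title and nil.
local function split_title(title)
    local base, sub = title:match("^(.+)/([^/]+)$")
    if base and known_codes[sub] then
        return base, sub
    end
    return title, nil
end

-- An intended language subpage is split as expected.
print(split_title("fugiebamus/la"))
-- The English word "s/he" is misread as a Hebrew subpage of "s".
print(split_title("s/he"))
-- "and/or" is left alone only because "or" is not in the table above.
print(split_title("and/or"))
```

Under this rule a bot cannot tell [[s/he]] the English pronoun apart from a Hebrew entry for [[s]], so any real scheme would need an explicit way to mark or exempt such titles.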
- Strongly oppose. It's unhelpful technically (as Yair explained above), it doesn't benefit the readers, and it's worse for editors like me who work with several related languages at a time. Looks like a classic lose-lose, and this format is one of the reasons why I don't edit on wiktionaries in other languages I speak. —Μετάknowledgediscuss/deeds 23:39, 9 April 2013 (UTC)
- Why doesn't it benefit the readers? When we used to hear from normal users, one common complaint was that they found the presence of many languages on the page confusing. Another common complaint was about the incredible length of the table of contents. Do you have anything to support your assertion? DCDuring TALK 00:03, 10 April 2013 (UTC)
- It seems you misunderstand this. The presentation will still be the same. The page ni will still have all the languages and the long TOC. As for the long TOC, TabbedLanguages solves that, and the sooner it is implemented, the better. —Μετάknowledgediscuss/deeds 19:19, 11 April 2013 (UTC)
- @Metaknowledge, one thing that is currently infeasible with our "all languages on one page" organization is linking reliably to any POS header for a given language.
- For instance, the Portuguese noun on the [[ni]] page has an ID of Noun_4. If an additional language is added above the Portuguese entry, this now has an ID of Noun_5, and anything that previously pointed to the Portuguese noun that was at Noun_4 is now pointing at who knows what, quite possibly at the Italian noun instead. The link URL, containing only the obscure numbered target [[ni#Noun_4]], wouldn't give editors or savvy readers any clue as to what language was intended.
- By splitting languages into their own subpages, we could have much more reliable linking: [[ni/pt#Noun]] will only ever point to the Portuguese noun, and will never inadvertently point to the Italian noun. Editors and savvy readers can also tell from the target URL what language the link points at.
- But this technical benefit is only possible when we don't throw all the data into one undifferentiated bucket. -- Eiríkr Útlendi │ Tala við mig 02:20, 10 April 2013 (UTC)
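The renumbering problem described above can be simulated in a few lines of Lua. The sketch assumes that repeated headings on one page get IDs of the form Noun, Noun_2, Noun_3 and so on in page order; the page contents below are hypothetical and are not the actual layout of [[ni]].

```lua
-- Assign an anchor ID to every section, numbering repeated headings in page order.
local function assign_ids(sections)
    local seen, ids = {}, {}
    for i, s in ipairs(sections) do
        local n = (seen[s.heading] or 0) + 1
        seen[s.heading] = n
        local id = s.heading
        if n > 1 then
            id = s.heading .. "_" .. n
        end
        ids[i] = { lang = s.lang, id = id }
    end
    return ids
end

-- Look up the anchor ID of the first section belonging to a language.
local function id_for(sections, lang)
    for _, entry in ipairs(assign_ids(sections)) do
        if entry.lang == lang then
            return entry.id
        end
    end
    return nil
end

-- Hypothetical page with one Noun section per language.
local page = {
    { lang = "Italian", heading = "Noun" },
    { lang = "Japanese", heading = "Noun" },
    { lang = "Navajo", heading = "Noun" },
    { lang = "Portuguese", heading = "Noun" },
}

print(id_for(page, "Portuguese"))  -- prints: Noun_4

-- Someone adds a new language section above the others.
table.insert(page, 1, { lang = "Albanian", heading = "Noun" })

print(id_for(page, "Portuguese"))  -- prints: Noun_5
print(id_for(page, "Navajo"))      -- prints: Noun_4 (old links now land here)
```

With one subpage per language, each page would have its own numbering, so adding an Albanian subpage could not shift the Portuguese anchors.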
- Why would we want to link to POS sections? I understand wanting to link to specific senses, which we can do with {{senseid}}, but where would it be useful to link specifically to a POS header? --Yair rand (talk) 02:42, 10 April 2013 (UTC)
- There is {{anchor}}, which can be used to allow links like foo#Dutch_noun to specific sections in those entries (that relatively minuscule minority of our 3.3 million entries) which have two Noun (or Verb, etc.) sections. - -sche (discuss) 04:04, 10 April 2013 (UTC)
- Also, e.g., the Swedish word en (meaning "juniper", not en meaning "one") is defined at en#Etymology_2_3. —Michael Z. 2013-04-10 03:16 z
- Eirikr, it seems that the big advantage you are advertising already exists by means of {{anchor}}. Do you have any convincing argument that we don't already have the capabilities for? —Μετάknowledgediscuss/deeds 19:19, 11 April 2013 (UTC)
Etymtree
See Template:etymtree/Module:etymtree (which is horribly written at the moment, but ignoring that for a minute...), as used in Appendix:Proto-Indo-European/wódr̥ and Appendix:Proto-Germanic/watōr. The full tree is stored at Template:etymtree/ine-pro/wódr̥, but the relevant branches are pulled out by the template. Are there any serious downsides to using this system instead of the current system where trees are either duplicated across entries or missing some parts? --Yair rand (talk) 09:03, 9 April 2013 (UTC)
- The main problem I see with your naming scheme is that words might have multiple sets of descendants, like *aljaną. How do you keep them apart? —CodeCat 13:05, 9 April 2013 (UTC)
- If they're both roots of separate trees, then they could just be called Template:etymtree/gem-pro/aljaną/1 and Template:etymtree/gem-pro/aljaną/2 (or something like that), I guess, and that wouldn't really cause any additional problems.
- If there are multiple words in one language with the same spelling on the same etymology tree, however... yeah, I have no idea how to deal with that. Hm... --Yair rand (talk) 22:42, 9 April 2013 (UTC)
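As a rough illustration of "the full tree is stored in one place, but the relevant branches are pulled out", here is a minimal Lua sketch. It is not the code of Module:etymtree: the node layout (lang, term, children), the function name and the tiny sample tree are assumptions chosen to show the idea.

```lua
-- Illustrative tree; the node layout (lang, term, children) is an assumption.
local tree = {
    lang = "ine-pro", term = "wódr̥",
    children = {
        { lang = "gem-pro", term = "watōr", children = {
            { lang = "ang", term = "wæter", children = {
                { lang = "en", term = "water", children = {} },
            } },
        } },
        { lang = "grc", term = "ὕδωρ", children = {} },
    },
}

-- Depth-first search for the node matching a language code and term.
-- Returns the node (with its descendants) or nil if it is not in the tree.
local function find_branch(node, lang, term)
    if node.lang == lang and node.term == term then
        return node
    end
    for _, child in ipairs(node.children) do
        local found = find_branch(child, lang, term)
        if found then
            return found
        end
    end
    return nil
end

-- The Proto-Germanic appendix page would show only this branch.
local branch = find_branch(tree, "gem-pro", "watōr")
print(branch.children[1].term)  -- prints: wæter
```

Note that this lookup returns only the first match, which is exactly the weakness raised above: two identically spelled words in the same language on one tree could not be told apart by (lang, term) alone.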
Portuguese verb oddity
In the conjugation table for Portuguese -erir verbs (see ferir as an example) it says that investir, revestir and vestir have this conjugation. This is false, but the templates are so convoluted that I can't figure out how to fix it. SemperBlotto (talk) 15:36, 11 April 2013 (UTC)
- They do. They are third-conjugation verbs where the e preceding the thematic vowel becomes i in some forms. — Ungoliant (Falai) 16:05, 11 April 2013 (UTC)
- But they use {{pt-conj|xyz|vestir}}, not {{pt-conj|xyz|erir}}. SemperBlotto (talk) 16:08, 11 April 2013 (UTC)
- Ok. I changed them. — Ungoliant (Falai) 18:53, 11 April 2013 (UTC)
A Question on Modules
I've been seeing contributions on modules lately. Could they replace templates, or could they both stay? Besides, I'm wondering, what are modules, and how are they used? --Lo Ximiendo (talk) 08:03, 12 April 2013 (UTC)
- See WT:LUA, w:WP:LUA --Z 09:05, 12 April 2013 (UTC)
- Short answer: Modules do not replace Templates, but they complement them when complex operations (such as string manipulations) are required. Dakdada (talk) 13:40, 12 April 2013 (UTC)
Double references
In the page herre, why does footnote 1 appear in two places? Isn't each <references/> supposed to list only those footnotes that were added since the last appearance of that tag? Has this changed and what is the cure? --LA2 (talk) 22:38, 12 April 2013 (UTC)
The page mw:Extension:Cite/Cite.php says "In the case of multiple references-tags on a page, each gives the references defined in the ref-tags from the previous references-tag", which is how I remember it used to function. --LA2 (talk) 22:51, 12 April 2013 (UTC)
- Oh, my fault, the same <ref>...</ref> tag is indeed repeated in the next section. --LA2 (talk) 22:53, 12 April 2013 (UTC)
GSOC Proposal - DICT api to Wiktionary
There is now a project idea for the 2013 Google Summer of Code to make Wiktionary content available via the DICT protocol. This is in part due to bugzilla #36881. There are currently more than 15 apps with a couple million downloads between them which use Wiktionary content for dictionary reference, but as far as I am aware each uses old data dumps which have been processed by a 3rd party, some many years ago. If this project is accepted and implemented in MediaWiki, we can expect our content reuse to climb dramatically as apps would be able to retrieve our latest data. The WMF is looking for a community liaison for this project, just in case a student comes along looking to pick this one up. (This suggests to me there is interest at the foundation to see this implemented in the MediaWiki api, though that is plainly speculation on my part.) So, the first discussion point is: who would be interested in being the go-between with the developers? - Amgine/ t·e 15:36, 16 April 2013 (UTC)
English category bug
Was just browsing Category:English words prefixed with cyno-. If you hover over the individual entries, they are linked to with an invalid hash anchor {{{{{lang}}}}}. I suppose this is from using {{prefix}} with no language. Mglovesfun (talk) 22:43, 16 April 2013 (UTC)
- I've fixed that. But problems like that could probably be prevented in the future by changing this template. The language code should really be the first parameter, so that it's clear that it's mandatory. —CodeCat 23:01, 16 April 2013 (UTC)
- Or made to be not mandatory at all, which is hypothetically at least what it does now. Mglovesfun (talk) 23:10, 16 April 2013 (UTC)
- I firmly believe that using English as the default is a bad practice, so I can't agree with that. Furthermore, all our other category templates already take the language code as their first parameter. —CodeCat 23:16, 16 April 2013 (UTC)
Documentation tab of templates and modules
We've been slowly moving template documentation from /doc to /documentation. Currently, though, the "documentation" tab at the top of the page still links to /doc. How can this be changed? Modules should also have such a tab. —CodeCat 16:44, 18 April 2013 (UTC)
- It's in Mediawiki:Common.js, under "Make tabs for citations-pages and template-documentation-pages". --Yair rand (talk) 16:45, 18 April 2013 (UTC)
Using a function in the same module
It may sound like a rather dumb question but how can I achieve this? Say the code is
p = {}
function p.a(f)
  text = mw.ustring.gsub(f.args[1],'.','a')
  return text
end
function p.b(f)
  text = p.a(f.args[1])
  return text
end
return p
Thanks. Wyang (talk) 23:12, 18 April 2013 (UTC)
- I'm not sure what you're trying to achieve, but calling p.b won't work because f.args[1].args[1] probably doesn't exist. By passing the specific argument to p.a, the new f is set to the previous f.args[1], and since it doesn't itself have an args[1], it will break. Is text = p.a( f ) what you want to have? (Note that this would be basically the same thing as just setting p.b = p.a.) --Yair rand (talk) 00:19, 19 April 2013 (UTC)
- I see. It should be text = p.a(f) in p.b(f). Thanks. Wyang (talk) 00:27, 19 April 2013 (UTC)
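For reference, the corrected pattern from this thread might look like the sketch below. Two things are assumptions made so that it runs outside Scribunto: string.gsub stands in for mw.ustring.gsub (it works on bytes rather than Unicode characters), and fake_frame is a plain table imitating the frame object.

```lua
local p = {}

-- Replace every character of the first argument with "a".
-- string.gsub is used here in place of mw.ustring.gsub (an assumption so
-- that the sketch runs in plain Lua); it returns the new string and a count,
-- and assigning to a single local keeps only the string.
function p.a(frame)
    local text = string.gsub(frame.args[1], ".", "a")
    return text
end

-- Reuse p.a by passing the whole frame through, not frame.args[1].
function p.b(frame)
    return p.a(frame)
end

-- Hypothetical stand-in for the frame that Scribunto would supply.
local fake_frame = { args = { "word" } }
print(p.b(fake_frame))  -- prints: aaaa

-- In a real module the file would end with: return p
```

Declaring text as local also avoids leaking a global variable, which the original snippet did.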
Adding a template to a category
I've added Template:ru-noun-anim-1-unc (and will add other templates with the -unc suffix) to Category:Russian uncountable nouns. Terms that use the template now show that they belong to Category:Russian uncountable nouns, e.g. Нептун. When I open the category, it just shows 8 terms! What is wrong? Is there a DB delay or something? --Anatoli (обсудить/вклад) 23:46, 18 April 2013 (UTC)
- Yes, it was a database delay. The problem is, there are some singular-only entries in that category now, like Иисус, which isn't uncountable, just singular-only. Unless Russian grammar doesn't make any distinction between these two. Mglovesfun (talk) 14:42, 19 April 2013 (UTC)
- Does any grammar make that distinction? —CodeCat 15:21, 19 April 2013 (UTC)
- Well, English does; you can't say "some Jesus" in the same way you can say "some grain" or "some water". French does too. Mglovesfun (talk) 15:31, 19 April 2013 (UTC)
- "Yesterday I met some Jesus who was trying to sell me a watch." —CodeCat 15:43, 19 April 2013 (UTC)
- MG said "in the same way you can say 'some grain' or 'some water'". That's a different sense of "some". Chuck Entz (talk) 19:13, 19 April 2013 (UTC)
- Is the suggestion that "uncountable" means the same as "something you can have a quantity of"? To me, uncountable means that it can't consist of multiple individual instances. —CodeCat 19:24, 19 April 2013 (UTC)
- Thanks for addressing this, guys. I'm having second thoughts about this categorisation, though. The templates with "-unc" suffix are used when there are no plurals. Theoretically, personal names, names of cities, gods, etc. all can have plurals. --Anatoli (обсудить/вклад) 09:49, 20 April 2013 (UTC)
{{#invoke:a|function|text||}} seems to be treated differently from {{#invoke:a|function|text|}}, as if f.args[3] == '' then or if f.args[3] == nil then treats the former code as false but the latter as true. Is there a way to solve this? Thanks. Wyang (talk) 05:16, 20 April 2013 (UTC)
- It's not a bug; an empty string is just not the same as a nil value. Just write if f.args[3] == nil or f.args[3] == '' then in your code. Dakdada (talk) 12:01, 20 April 2013 (UTC)
- I usually write something like this: local param = args[3]; if param == "" then param = nil end —CodeCat 12:37, 20 April 2013 (UTC)
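The two suggestions above can be folded into a small helper. The sketch below is plain Lua; the helper name is invented, and the two tables are ordinary Lua tables standing in for frame.args (an assumption, since the real frame.args is supplied by Scribunto).

```lua
-- Treat an empty string the same as a missing argument.
local function blank_to_nil(value)
    if value == "" then
        return nil
    end
    return value
end

-- Plain tables standing in for frame.args (assumption for illustration).
local args_two_pipes = { "text", "", "" }  -- like {{#invoke:a|function|text||}}
local args_one_pipe = { "text", "" }       -- like {{#invoke:a|function|text|}}

print(blank_to_nil(args_two_pipes[3]) == nil)  -- prints: true
print(blank_to_nil(args_one_pipe[3]) == nil)   -- prints: true
print(blank_to_nil(args_two_pipes[1]))         -- prints: text
```

After normalising this way, a single test such as if param == nil then covers both invocations.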
Ugly font in taxonomic name inflection line
What template or other change led to the use of a hideous, too-small serif font for taxonomic name entries? See [[Datura]] or any other such entry. Who makes such decisions? DCDuring TALK 14:50, 21 April 2013 (UTC)
- Looks the same as ever to me... sure it isn't on your end? —Μετάknowledgediscuss/deeds 15:03, 21 April 2013 (UTC)
- Could be. Where would I look?
- It seems to affect Translingual inflection lines, various other language inflection lines (e.g. Greek), certain uses of {{term}}, and various template-sourced text such as the content of a show/hide bar for Greek declension. It appears using both Vector and Monobook skins. DCDuring TALK 15:29, 21 April 2013 (UTC)
- Looks normal to me (Monobook under Chrome). SemperBlotto (talk) 15:34, 21 April 2013 (UTC)
- I disabled Webfonts. Is that a possibility? [Apparently not].
- It occurs with all my per-browser choices at default. I haven't noticed anything different at other websites. DCDuring TALK 15:36, 21 April 2013 (UTC)
- It appears whenever I use Template:head in principal namespace or
- {{term}}: gratis
- but not
- {{l/en}}: gratis
- {{t}}: gratis (en).
- Does it have to do with the font selection used by Template:head and {{term}}? A CSS solution to fix my problem for me would help me, but would not be wise in case I am functioning as a miner's canary or the idiot in idiot-proofing. DCDuring TALK 15:59, 21 April 2013 (UTC)
- I don't see the problem in Safari/Mac or Firefox/Mac. Does it affect any of these lines?:
- lang="mul"
- class="mention-latin"
- What browser/version are you using? If Firefox, check your Preferences > Content > Languages > Fonts & Colors > Default Font > Advanced > Fonts for. Make sure that under both "Western" and "Other Languages" you have "Sans-Serif" set to a sans-serif font, preferably the same one. —Michael Z. 2013-04-21 16:17 z
- Bingo! Thank you very much, MZ. I didn't remember making that change, though I remember visiting the page.
- What advantage do we get from letting that kind of user preference affect the display of this website? No other site that I visited seemed to be affected by that mistaken selection, so they must exercise more control over fonts - and do so uniformly. DCDuring TALK 21:17, 21 April 2013 (UTC)
- Lucky guess. 🐰
- Few users ever touch those settings, especially now that browser support for UTF-8 has obsoleted separate code pages for each language or script. Safari only ever had default and monospace font settings, and has done away with those completely, although it does have a user style sheet. Firefox probably has a hard time dropping archaic features because of workflow.
- I think few sites use lang attributes at all, except perhaps on the root html element, and only a tiny proportion even have reason to use things like lang="mul" or lang="und". We are simply more langed up than any other website.
- But because we are trying to keep our lang codes correct, the few readers who want control will be able to use their browser's language preferences and user style sheets. I anticipate browser makers will continue to improve tagged language support. —Michael Z. 2013-04-21 22:55 z
Can Lua determine if a template is called from another template?
Lua uses the frame:getParent() function to find out the parameters that were passed to the template that invoked it. So, for example, say that {{term}} contains {{#invoke:term cleanup|cleanup}}. Then if {{term}} is called like {{term|word}}, then within Module:term cleanup, frame:getParent().args[1] will equal "word". What I would like to know is... is it possible for the module to determine, in some way, which namespace {{term}} was called from, so that it can tell the difference between, for example, {{term}} being directly in a mainspace entry, and being called from another template? —CodeCat 15:21, 21 April 2013 (UTC)
- I don't believe so, no. They don't want us inspecting the whole stack, just the topmost frame. (And TBH, I think that's a good thing. If I write a usage-note template that invokes {{term}}, I should be able to expect that it will behave the same way as if I'd put the usage-note directly in the entry.) —RuakhTALK 17:02, 21 April 2013 (UTC)
- I understand that, but it would be very useful (for clearing out erroneous template uses) to be able to find which are being called through another template. Because fixing that template would probably fix many entries at the same time. It's a shame we can't use it... —CodeCat 17:08, 21 April 2013 (UTC)
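The parent-frame lookup described in this thread can be sketched outside MediaWiki with mock objects. This is only an illustration under stated assumptions: `template_frame` and `module_frame` are hand-made stand-ins for the frames MediaWiki would supply, `cleanup` is a hypothetical entry point for Module:term cleanup, and the sketch uses `frame:getParent()`, the method name given in the Scribunto reference manual.

```lua
-- Mock of the two frames involved when an entry contains {{term|word}}.
-- In Scribunto these objects are provided by MediaWiki; here they are plain tables.
local template_frame = { args = { "word" } }   -- arguments passed to {{term}}
local module_frame = {
    args = {},                                  -- arguments written after #invoke
    getParent = function(self) return template_frame end,
}

-- Hypothetical entry point: read the first template parameter via the parent frame.
local function cleanup(frame)
    local parent = frame:getParent()
    local word = parent and parent.args[1]
    if word == "" then
        word = nil                              -- treat an empty parameter as missing
    end
    return word
end

print(cleanup(module_frame))
```

Note that the mock parent carries only the template's arguments. Nothing in it records which page or template contained {{term}}, which matches the limitation described in the replies.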
Kassadbot not running?
There are over 4,000 entries in Category:Requests for autoformat. SemperBlotto (talk) 11:12, 23 April 2013 (UTC)
- It was blocked due to a dispute regarding Japanese entries. So yeah. -- Liliana • 19:56, 23 April 2013 (UTC)
- There are some allegations that you, Liliana-60, deliberately changed KassadBot's code to pick up entries in Category:Japanese romaji after a consensus was reached on how to format romaji entries: a strict format, whose definition lines are generated by the template. See kochira:
==Japanese==
===Romanization===
{{ja-romaji|こちら}}
- Which produces:
Japanese kochira - See こちら
- KassadBot didn't pick up any single romaji entry before that between 16 March and 7 April (after my edit to add # on a new line in Template:ja-romaji see diff). KassadBot started flagging romaji and adding them to Category:Japanese definitions needed after the 7th of April.
- If you're not going to start working on Japanese entries, please consider changing the code back to what it was before 7th April or make an exception for Japanese and Gothic romanisation entries. It's possible and you know it. Editors working with Japanese have already expressed their strong opinion on this and converted all (nearly 7,000) romaji entries to a new style. I apologize in advance if you didn't do anything deliberately but you didn't sound very convincing when you denied changing KassadBot's code. --Anatoli (обсудить/вклад) 22:53, 23 April 2013 (UTC)
- That's kind of the problem with negative proof; it's almost impossible to prove that one didn't do a certain thing. I could show the application's timestamp (with a last-modified date sometime in December 2012) but then people would say it's faked. I could say that anyone who runs the bot with the code I provided on this wiki (for a good reason!) would see it perform the same changes, but nobody would try and people would still not believe me. See what a difficult situation this is? -- Liliana • 05:59, 24 April 2013 (UTC)
- OK, I believe you. We have to move forward but I don't know what we'll do. --Anatoli (обсудить/вклад) 06:19, 24 April 2013 (UTC)
Returning the number of items under a Category
Hello. I'm an administrator in the Spanish Wikcionario and they're trying to create a template for a "language of the month" feature. A user is asking if we could display a sentence like "We currently have X number of entries in" (the language of the month). Does anybody here know if there is a magic word or parameter I can use to return the number of entries in a Category? (That way the template would show the number of items in the Languageofthemonth-Español Category). Thanks in advance for your help. If you could answer in my User talk:Edgefield page here, that would be even better. Best, --Edgefield (talk) 22:41, 23 April 2013 (UTC)
- Ah, I just saw this {{PAGESINCATEGORY:categoryname}} word; I'll give it a try. Thanks, --Edgefield (talk) 22:42, 23 April 2013 (UTC)
My browser refuses to load these two gadgets when browsing through HTTPS. Please copy the code from w:MediaWiki:Gadget-RegexMenuFramework.js and w:MediaWiki:Gadget-HotCat.js. Keφr (talk) 07:44, 24 April 2013 (UTC)

I have created this module as a replacement for {{head}}. Not all of it works yet, only the categories at the moment. I've moved that part over from the template to the module, and things seem to work. —CodeCat 14:14, 25 April 2013 (UTC)
- On a side note, please don't forget to add comments in the code. There are already too many uncommented modules out there. Dakdada (talk) 14:36, 25 April 2013 (UTC)
- Was this discussed anywhere before it was implemented? --Yair rand (talk) 15:47, 25 April 2013 (UTC)
- Yes, here. —CodeCat 15:51, 25 April 2013 (UTC)
New idea about translations
The translation sections currently look like this:
{{trans-top|furniture}}
* Armenian: {{t-|hy|պահարան|tr=paharan}}
* Dutch: {{t+|nl|kast|m|f}}
{{trans-mid}}
* Greek: {{t+|el|ντουλάπι|n|tr=ntoulápi|sc=Grek}}
* Persian: {{t|fa|کمد|tr=komod|sc=fa-Arab}}
{{trans-bottom}}
It is possible to change it to something like this, with Lua (compare this):
{{trans|furniture|
* Armenian: պահարան [-]
* Dutch: kast mf [+]
* Greek: ντουλάπι n [+]
* Persian: کمد (komod)
}}
I want to know if this is considered helpful by the community and is worth it. The code is more readable and is easier to edit for newbies, and the users won't have to know the ISO code of languages to edit. --Z 16:07, 26 April 2013 (UTC)
- Are you saying that the translation should not be wikilinked? You realise that we would lose the link to the translated word in the "foreign" Wiktionary. SemperBlotto (talk) 16:15, 26 April 2013 (UTC)
- No, read it again. --Z 16:36, 26 April 2013 (UTC)
- It is possible, but would it actually be more efficient? This effectively turns Lua into a parser, which may slow things down rather than speed them up. —CodeCat 16:16, 26 April 2013 (UTC)
- That may be correct; we need to test it and see it in practice. I've compared {{l}} to {{l-list}}: l-list was not slower, but a bit faster. My current Internet connection is too slow, unfortunately, so my test may be inaccurate; I would be thankful if someone else tested it too. We can generalize the result to {{t}} vs. the Lua method. --Z 16:36, 26 April 2013 (UTC)
- I see lots of potential for problems from different versions (including misspellings) of the language names, varying order of arguments, varying punctuation, etc. It looks simpler than it is: although there's no obvious inline code, everything has to be set up the way the module expects it, or you'll need complex, time-consuming code to allow for all the possibilities. We have bots doing this kind of parsing, but we don't have site visitors waiting for the bots to finish every time they view an entry. There's also the matter of coordinating with changes to the translation-adder and to bots, though that's secondary. Chuck Entz (talk) 17:20, 26 April 2013 (UTC)
- Some of these problems, like versions of language links, are really easy to fix without making the code that complex and slow. Regarding order of arguments, that's true: making things easier to work with usually comes at the cost of increasing the risks of using them -- the easier you can edit and change, the more things will be unintentionally messed up (although a fixed order for arguments has an advantage too: the code will be more similar to the output). The question is: do you think it is worth it overall? --Z 18:12, 26 April 2013 (UTC)
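To make the parsing question in this thread concrete, here is a rough sketch of what reading one line of the proposed format might involve. It is not the module under discussion: `parse_line` and the field names are hypothetical, and the grammar (word, optional gender letters, optional transliteration in parentheses, optional [+] or [-]) is only inferred from the four example lines in the proposal.

```lua
-- Hypothetical parser for one line of the proposed format, e.g.
--   * Dutch: kast mf [+]
--   * Persian: کمد (komod)
local function parse_line(line)
    local lang, rest = line:match("^%*%s*([^:]+):%s*(.-)%s*$")
    if not lang then
        return nil                       -- not a translation line
    end
    local item = { lang = lang }

    -- A trailing [+] or [-] is taken as the interwiki marker.
    local body, mark = rest:match("^(.-)%s*%[([%+%-])%]$")
    if mark then
        item.interwiki = mark
        rest = body
    end

    -- A trailing parenthesised part is taken as the transliteration.
    local before_tr, tr = rest:match("^(.-)%s*%((.-)%)$")
    if tr then
        item.tr = tr
        rest = before_tr
    end

    -- A final token made up only of the letters m, f, n, c is taken as gender.
    local before_gender, gender = rest:match("^(.-)%s+([mfnc]+)$")
    if gender then
        item.gender = gender
        rest = before_gender
    end

    item.word = rest
    return item
end

local dutch = parse_line("* Dutch: kast mf [+]")
print(dutch.lang, dutch.word, dutch.gender, dutch.interwiki)
```

Even this small sketch has to guess: a multi-word translation whose last word happens to consist only of the letters m, f, n or c would be misread as a gender, which is the kind of ambiguity raised in the replies above.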
Never mind guys, I'm disappointed. This will, at best, become something like #Module:links which eventually went nowhere, even though it was nothing but improvement. Trying to improve things by changing older ways is just a waste of time here. --Z 18:22, 26 April 2013 (UTC)
- Yea, my two cents on it here, it's surely a lot more trouble than it's worth. :/ Certainly I don't think it's worth fiddling around to change stuff that much just to be a little more newbie friendly...the translation adder built in trans tables is pretty good for that IMO. As for having to know ISO codes, people (specifically newbies) should just learn to search ethnologue or whatever it is, or even search on wiktionary for
- the entry for the language
- a safer bet "Category:X language".
- User: PalkiaX50 talk to meh 18:38, 26 April 2013 (UTC)
- I support it. Semper's comment doesn't make sense, CodeCat has raised a concern without testing it, and Chuck said the code will be "complex" without giving any real examples of why it will be any more complex than what we already have. Why don't you guys give it a chance or at least bring up a real, concrete problem with it? —Μετάknowledgediscuss/deeds 23:09, 26 April 2013 (UTC)
- The complexity and the speed problems will happen because we will have to write a lot of code just to make a Lua module "understand" this new programming language that we will be creating. Essentially, we will be implementing a language within another language. So why should we reinvent the wheel when we already have a language that works and that everyone is familiar with: template code? The presence of the translation editor makes this even more redundant, because users won't even need to interact with the code. As long as there is no guarantee that this will not make things worse, I don't see how you could support it unless you do not actually have a full grasp of the implications of such a project. —CodeCat 23:24, 26 April 2013 (UTC)
- "This will, at best, become something like #Module:links which eventually went nowhere" oh come on Lua is really, really new, it's way too early to say 'eventually went nowhere'. Mglovesfun (talk) 23:42, 26 April 2013 (UTC)
- I did a quick search for names of languages indigenous to the British Isles to give a very rough idea of the complexity involved:
- Irish, Erse, Irish Gaelic, Hibernian Gaelic, Gaeilge, Gaelige, Gaedhlag, Gaedhilge, Gaedhilic, Gaeilic, Gaeilig, Gaelic
- Manx, Manks, Manx Gaelic, Gaelg, Gailck
- Scots Gaelic, Scotch Gaelic, Scots-Gaelic, Scotch-Gaelic, Caledonian Gaelic, Erse, Gàidhlig, Gaelic
- Welsh, Welch, Cambrian, Cambric, Cymric, Cymraeg
- Cornish, Kernowek, Kernewek
- Old English, Anglo-Saxon, Anglo Saxon, Anglosaxon, Englisc
- Scots, Scotch, Inglis
True, more than half of these are obscure, and unlikely to be used here, though there are enough works with odd or old usage in places like Google Books that it's hard to be categorical. It's also true there are probably other names I missed, and variants in the Celtic languages due to consonant mutation.
What does the code do when someone puts "Gaelic", or "Scottish", or uses some misspelling that could be interpreted as more than one of the names for the Goidelic languages, which all look very similar? There are several other cases where the same name is used for more than one language. It's true that template code has problems with people using se for sv, lt for lv or la, and so on, but at least template code looks tricky, so people are more likely to look up the correct code. The new format looks like you can put just any old thing and the system will figure it out: in effect, it appears as if it's offering to do the nitpicky part for you. In reality, though, it will probably have its own set of constraints. It kind of reminds me of w:COBOL, and all the talk of how it was going to make coding just like writing English, and make it possible for computer-illiterates to understand it.
All of this can be dealt with, but it will take lots of work to educate both the system and the editors. Even if we keep both old and new going side-by-side to space out the conversion work, there's still going to be cleanup categories with the stuff the module hasn't figured out yet, and someone's going to have to go through them. It would have to be a pretty big improvement to justify the extra work. Chuck Entz (talk) 01:50, 27 April 2013 (UTC)

My opinion is that it's not a terrible idea, but Conrad's translation adder already has it beat in terms of editor friendliness. Because of this, it seems to me like we would be doing a lot of work for no gain. Also, correct me if I'm wrong, as Lua is still very new to me and I don't fully understand it, but wouldn't Lua have to essentially recompile the whole thing every time someone adds a translation? Conrad's approach has the advantage of once and done, that is, the computation is done once and then hard-coded into the entry, and does not have to be figured out again. -Atelaes λάλει ἐμοί 23:56, 26 April 2013 (UTC)
- Every page is reprocessed from scratch whenever it's viewed and the cache is "old". Editing and saving a page forces a refresh, but it's also refreshed after a short time (maybe less than a day but I don't know for sure). —CodeCat 00:00, 27 April 2013 (UTC)
- I think he was looking at the creation of the template code as sort of a precompiling into a more machine-friendly format which wouldn't have to go through all the trouble of parsing every time. Of course, the template code has its own overhead, so it probably wouldn't make that much of a difference. Chuck Entz (talk) 01:01, 27 April 2013 (UTC)
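The name-ambiguity problem raised with the list of British Isles language names can be illustrated with a small lookup table. This is a sketch only: the table holds just a handful of the names listed above, the codes are the ISO 639 codes ga, gd, gv and sco, and `resolve` is a hypothetical function, not part of any existing module.

```lua
-- Illustrative data: a few of the names from the list above, keyed by language code.
local names = {
    ga  = { "Irish", "Irish Gaelic", "Erse", "Gaelic" },
    gd  = { "Scots Gaelic", "Scotch Gaelic", "Erse", "Gaelic" },
    gv  = { "Manx", "Manks", "Manx Gaelic" },
    sco = { "Scots", "Scotch" },
}

-- Invert the table: lower-cased name -> list of candidate codes.
local by_name = {}
for code, list in pairs(names) do
    for _, name in ipairs(list) do
        local key = name:lower()
        by_name[key] = by_name[key] or {}
        table.insert(by_name[key], code)
    end
end

-- Hypothetical resolver: returns a code, or nil plus the reason it could not decide.
local function resolve(name)
    local codes = by_name[name:lower()]
    if not codes then
        return nil, "unknown"
    end
    if #codes > 1 then
        return nil, "ambiguous"
    end
    return codes[1]
end

print(resolve("Manx"))      -- a unique name resolves to its code
print(resolve("Gaelic"))    -- shared by Irish and Scottish Gaelic in this table
print(resolve("Scottish"))  -- not in the table at all
```

The lookup itself is cheap. The work lies in deciding what the module should do with the "ambiguous" and "unknown" cases, which is where the cleanup categories mentioned above would come from.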
Help test the new account creation and login
Hi all,
After many weeks of testing, we (the editor engagement experiments team) are getting close to enabling redesigns of the account creation and login pages. (There's more background about how we got here and why in our blog post.) Right now we are trying to identify any final bugs before we enable new defaults. This is where we really need your help: for now, we don't want to disrupt these critical functions if there are outstanding bugs or mistranslated interface messages. So for about a week, the new designs are opt-in only for testing purposes, and it would be wonderful if you could give them a try. Here's how: If you have questions about how to test this or why something might be the way it is, I'd definitely check out our step-by-step testing guide and the general documentation. Many thanks, Steven (WMF) (talk) 19:48, 26 April 2013 (UTC)