| Wiktionary:Grease pit/2013/April Apr 7th 2013, 23:03 | | | | Line 112: | Line 112: | | | | | | | | == Very minor bug == | | == Very minor bug == | | | + | : ''Relevant script:'' [[MediaWiki:Gadget-PatrollingEnhancements.js]] | | | | | | | | In the recent changes script that provides for admins the red bar at the bottom right, when I type in that bar it scrolls down the page. It's so innocuous I haven't bothered to report it yet. So if I type a really long deletion summary, I then have to scroll back up to click the red 'd' link to delete the offending page. [[User:Mglovesfun|Mglovesfun]] ([[User talk:Mglovesfun|talk]]) 11:24, 7 April 2013 (UTC) | | In the recent changes script that provides for admins the red bar at the bottom right, when I type in that bar it scrolls down the page. It's so innocuous I haven't bothered to report it yet. So if I type a really long deletion summary, I then have to scroll back up to click the red 'd' link to delete the offending page. [[User:Mglovesfun|Mglovesfun]] ([[User talk:Mglovesfun|talk]]) 11:24, 7 April 2013 (UTC) | | | :What browser/system are you using? I (FWIW to whomever troubleshoots/fixes this issue) can't reproduce that behaviour; I can type lots of text into the deletion-summary bar, and the page doesn't scroll. [[User:-sche|- -sche]] [[User talk:-sche|(discuss)]] 22:43, 7 April 2013 (UTC) | | :What browser/system are you using? I (FWIW to whomever troubleshoots/fixes this issue) can't reproduce that behaviour; I can type lots of text into the deletion-summary bar, and the page doesn't scroll. [[User:-sche|- -sche]] [[User talk:-sche|(discuss)]] 22:43, 7 April 2013 (UTC) | | | + | | | | + | : Hmm. Can you experiment a bit, and describe the behavior a bit more precisely? For example: | | | + | :* If you type a single letter, then wait a moment, then type another letter, do you find that it scrolls a certain distance when you type the first letter, and then scrolls the same distance when you type the second letter? | | | + | :* Or do you find that the scrolling only happens when you hit the space bar? (In many browsers, the space bar can be used to scroll down the page, but of course it's not supposed to do that when you type an actual space into an actual text field!) | | | + | : —[[User: Ruakh |Ruakh]]<sub ><small ><i >[[User talk: Ruakh |TALK]]</i ></small ></sub > 23:03, 7 April 2013 (UTC) |
Latest revision as of 23:03, 7 April 2013 Current code can do {{l}}'s job, and {{l}} will use the module instead of its current code soon. The aim of the module is generally handling wikilinks, though -- not just in {{l}}, but in {{term}}, head templates, and other similar templates that create wikilinks. Some new features have been proposed at Template_talk:l#Lua-ising. The code for the features has been written and tested, we just need to gain official community consensus to implement it. Any thoughts or suggestions would be welcomed. --Z 04:28, 1 April 2013 (UTC) - Have you tested it to make sure it works in all cases that {{l}} works, and that it doesn't do anything it shouldn't? Also, what is the purpose of Module:useful stuff? "detect_script" in particular doesn't seem like it does anything useful. And the list of languages that have automated transliteration should be in Module:languages. I also warned you not to start adding all kinds of extra code to this until we're sure that it works the way it should. —CodeCat 13:41, 1 April 2013 (UTC)
- Assuming "detect_script" does what it seems to based on its name, that would be extremely useful. Several templates for multiscriptal languages like Tatar, Ladino, and Japanese have parameters that require the user to input what script an entry is in. If we can scrap that, that's be great. —Μετάknowledgediscuss/deeds 14:20, 1 April 2013 (UTC)
- But what do you do when the word contains characters in multiple scripts? —CodeCat 16:40, 1 April 2013 (UTC)
- That doesn't happen in Tatar or Ladino. It does happen in Japanese, and I'm not sure how it works. For example, アメリカ合衆国 is in both katakana and kanji, but is marked as katakana (in the template, that's kk). We'll have to ask a Japanese editor. —Μετάknowledgediscuss/deeds 00:45, 2 April 2013 (UTC)
-
-
-
-
- Why use
kk, the language code for Kazakh? In case it matters, the Japanese ISO script codes are:[1]
-
-
-
-
-
Hira: Hiragana Kana: Katakana Hrkt: Japanese syllabaries (alias for Hiragana + Katakana) Jpan: Japanese (alias for Han + Hiragana + Katakana)
-
-
-
-
- —Michael Z. 2013-04-03 21:38 z
- This is totally off-topic, but I guess it's a valid complaint about the template. The answer is that it's faster to type, just like pl= (code for Polish, means plural in templates) or tr= (code for Turkish, means transliteration in templates). In cases like these, editors' ease definitely outweighs using ISO script codes, because it really doesn't matter which we use. —Μετάknowledgediscuss/deeds 23:45, 3 April 2013 (UTC)
- Where is the code for the version with the proposed features?
- Automatically detecting script sounds like a very good idea, imo. --Yair rand (talk) 01:22, 4 April 2013 (UTC)
- Some are removed by CodeCat and you can find them in older revisions, some were moved to this module, others are in commented part of the code, e.g. recognizing reconstructed terms from "*" and linking to appendix is in prepare_title(). --Z 01:42, 4 April 2013 (UTC)
- Can somebody please help me work out how to implement script recognition in {{tt-pos}}? —Μετάknowledgediscuss/deeds 02:31, 4 April 2013 (UTC)
- I'm not opposed to these innovations in principle, but I do think that we should first get {{l}} to work with this module first, and keep it that way for at least a week or two so that we can be sure there are no unexpected problems. —CodeCat 03:20, 4 April 2013 (UTC)
- Why would we want to get {{l}} working with it first? That might could be a while... —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
- Currently detect_script() can't be invoked from templates, it's a better idea to rewrite that template in Lua. --Z 03:29, 4 April 2013 (UTC)
- Is it? I was hoping to have a model off which I might be able to design more templates with these features. —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
- It would be possible after Lua-ization of {{l}} and {{head}} and adding the ability of detecting scripts to them. --Z 05:08, 4 April 2013 (UTC)
- Regarding Japanese, I have no idea about how its writing system works, but it's possible to find katakana characters of a word and tag it with Kana class, and other non-katakana characters of the word (if there is any) would be kanji, I assume? If so, it's easy to fix. Does similar thing happen in any other language? --Z 03:29, 4 April 2013 (UTC)
- Not that I can think of, but we should assume so just to be safe. —Μετάknowledgediscuss/deeds 04:22, 4 April 2013 (UTC)
- The module is tested, the only problem is gender/number part -- output of Module:gender and number and those of gender/number templates are not identical. That's not much of a problem though, we can use the gender templates in the module for now. --Z 05:32, 4 April 2013 (UTC)
- Just use the module. The output of the module can always be changed if necessary, that isn't a good reason not to use it. —CodeCat 22:59, 5 April 2013 (UTC)
[edit] Substring module Thanks, Z! New question: do we have a basic string manipulation module, just to store stuff like taking a substring of a certain length from the end of a word, etc? If not, should I create Module:string or something? —Μετάknowledgediscuss/deeds 00:23, 5 April 2013 (UTC) - NP, no that's not needed, for this certain task you can simply use string.sub(). --Z 00:39, 5 April 2013 (UTC)
- But I want to invoke it directly in an un-Luacized template. That's why I reckon there should be a separate module for it. —Μετάknowledgediscuss/deeds 22:51, 5 April 2013 (UTC)
-
- I see a need to decompose words in any script into components, including any diacritics and ligatures. Use: say you want to check how to pronounce a word in a complex script with diacritics - Burmese, Hindi, Bengali, Thai, Arabic, Hebrew, etc. A Devanagari syllable रा (rā) can't be looked up in Wiktionary:Hindi transliteration because it's र + ा and you can't take out the diacritic ा from रा to look it up. Some word processors allow to break strings into parts. So, yes, please. Not just the substring but a break up.
-
- Module:ko-hangul has the function syllable2Jamo, which in debug mode shows individual jamo for each hangeul (한 (han) = ㅎㅏㄴ (h a n). Need to make it break up hangeul in the run mode. Tried to do it with "syllable2JamoSep" but didn't work. --Anatoli (обсудить/вклад) 00:41, 5 April 2013 (UTC)
-
-
- If I understand you correctly, you need
mw.ustring.gsub(text, "(.)", "%1 ") (try print(mw.ustring.gsub("रा", "(.)", "%1 ")) in console). --Z 00:58, 5 April 2013 (UTC)
- Module:String exists on Wikipedia. Because it doesn't exist here yet, I copied the entire code and added extra bits to it when I wrote Module:bo-translit. Wyang (talk) 01:01, 5 April 2013 (UTC)
-
- Great stuff, thank you! I also tried Thai "เค็ม" = เ ค ็ ม and Arabic "اَلْلُغَةُ ٱلْعَرَبِيَّةُ" = ا َ ل ْ ل ُ غ َ ة ُ ٱ ل ْ ع َ ر َ ب ِ ي َ ّ ة ُ. --Anatoli (обсудить/вклад) 01:11, 5 April 2013 (UTC)
-
-
- Wyang, it seems you're able to do auto-translit for a few complicated script languages, especially those you're familiar with, you've done Burmese before in the Chinese Wiktionary, haven't you? I'm especially keen if you could do Hindi, Bengali, Thai, Lao, Burmese and Khmer in any order, possibly Sinhalese. Korean module needs to be made better as well. I'm happy to assist with testing and getting/checking for translit. rules. I'm only familiar to some degree with Korean, Hindi and Thai. Angr is our Burmese expert and Stephen G. Brown knows a bunch, including Khmer and Telugu (no tones in Khmer, so it's simpler than some others). ZxxZxxZ has done a great job with Arabic and Persian but you can only do that much with partially phonetic languages. --Anatoli (обсудить/вклад) 01:23, 5 April 2013 (UTC)
- Wyang knows a lot about Burmese himself, I doubt he'll need much if any help from me for it! —Angr 10:18, 5 April 2013 (UTC)
-
-
-
-
- OK. Anyway, I hoped that function
print(mw.ustring.gsub("လုံချည်", "(.)", "%1 ")) could also be used for reading out individual characters, so that one could look up each character in e.g. WT:MY TR but are we missing some characters in လ ု ံ ခ ျ ည ်? I can't find some characters, e.g. ု. --Anatoli (обсудить/вклад) 10:59, 5 April 2013 (UTC)
-
-
-
-
-
- ု alone is "u." in MLCTS, ု+ ံ is "um". They are both on the my-translit page. As for the transliteration modules, I'll try to get more familiar with Lua first. While certain string decomposition functions are easier to use than wiki code, some others seem less straightforward, for example the equivalent of {{#switch: (?) Wyang (talk) 11:06, 5 April 2013 (UTC)
- What exactly is the purpose of all these string functions when most of them already have more complete Scribunto equivalents? Isn't this just reinventing the wheel? —CodeCat 22:57, 5 April 2013 (UTC)
[edit] Documentation subpage tab for templates As an example, {{head}}, at the top the tab is still Template:head/doc but we're migrating these to /documentation (Template:head/documentation). I guess there's a MediaWiki page somewhere that needs updating. Please. Mglovesfun (talk) 12:23, 5 April 2013 (UTC) - Yes, but currently the majority still uses /doc. Would it be possible for it to support both, until all the subpages have been moved over? {{documentation}} already does this (it shows /doc if it exists, but prefers /documentation otherwise) —CodeCat 14:17, 5 April 2013 (UTC)
[edit] An idea for a new way to make form bots This idea just kind of came to me but I think it could be very useful. The way that User:MewBot and probably all the other bots currently work is that they parse the invocation of the template, and then try to mimic the template as closely as possible. This is really a lot of work and it's not ideal, because it means all the code has to be duplicated and kept synchronised. So I thought... why not let another template or Lua generate the forms in a machine-readable format? That way, the bot only has to understand the output, but no longer has to duplicate any of the intricacies of the template/module. I have added this to Module:nl-verb and less than 5 minutes later I already have a working example. :) See User:CodeCat/bot example. To use it on any Dutch verb table, just add the parameter bot=1 to the template. It is really easy to do as long as your template or module has a strict separation between the part that generates all the forms and the part that displays the table. By writing a second output function that displays a list of forms instead of a fancy table, you can get this result. It can even be done without Lua at all, but Lua does make it a lot easier. So what can we do with this? In its current form, a bot script could "find" the inflection template on a page like it does before, but it could then add the bot=1 parameter and expand the template (via the MediaWiki API, which is built into the Python wikibot framework). It can then parse the machine-readable output and use that to create entries for each of the forms as before, instead of having to generate all the forms itself. However, this concept could be taken further. The template or module that generates the machine-readable format could actually generate the full form-of entries itself, so that the bot doesn't even need to parse the output but could just flat out create entries straight from it. This, in turn, would open up one huge door: a single bot that can do any inflection table, in any language, without modifications, as long as the template/module generates the proper output. —CodeCat 01:31, 6 April 2013 (UTC) - I've now made these changes to User:MewBot, and it seems to work ok with the few test entries I've tried it on. —CodeCat 22:01, 6 April 2013 (UTC)
- Looks good to me. —RuakhTALK 23:54, 6 April 2013 (UTC)
[edit] toolserver I've tried to use several toolserver utilities yesterday and this morning, such as SUL accounts, and I have not been able to make a connection. Is toolserver temporarily down, or is it gone? I remember someone saying that the toolserver tools were being migrated to Wikimedia Labs. If the SUL accounts tool is now at Wikimedia Labs, is there a way to fix the links here (SUL accounts link appears at the bottom of anyone's contributions page, such as Special:Contributions/Stephen_G._Brown)? —Stephen (Talk) 06:16, 6 April 2013 (UTC) - The toolserver has issues, but it is unrelated to tools migrations. Dakdada (talk) 15:40, 6 April 2013 (UTC)
-
- Thanks. It seems to be working again, although very slow. —Stephen (Talk) 08:48, 7 April 2013 (UTC)
[edit] Limits on display of "orange link for missing language section" If one checks the appropriate box in per-browser preferences, links display in orange instead of blue "if the target language is missing on an existing page" (actually if the specified section of whatever kind is missing), eg, plate#Latin. plata#Latin displays blue though there is no Latin section at [[plata]]. Does anyone know whether there some kind of limit on the number of headings that are searched in the operation of the this feature? DCDuring TALK 12:28, 6 April 2013 (UTC) - No idea, but the reverse happens too. In the etymology sections of သစ်, 薪#Mandarin, 薪#Cantonese, 新#Middle Chinese, and 新#Cantonese all appear orange even though those sections do exist. —Angr 12:36, 6 April 2013 (UTC)
- Angr, that's a different problem. I'm pretty sure that's because none of the sections you linked to has a definition yet. —Μετάknowledgediscuss/deeds 16:45, 6 April 2013 (UTC)
- Uncategorized sections come up orange, yes. Mglovesfun (talk) 18:21, 6 April 2013 (UTC)
- OK, that's good to know. But what about DCDuring's issue? Why is plata#Latin blue? —Angr 18:28, 6 April 2013 (UTC)
- I searched the plata page for the word "Latin". It appeared in a context note in the second Spanish def.
- After deleting that context note, so the word "Latin" doesn't appear on that page, the plata#Latin link now appears in orange.
- Yair, that looks like a bug, maybe in page parsing -- would that be hard for you to fix? -- Eiríkr Útlendi │ Tala við mig 19:47, 6 April 2013 (UTC)
- No, that's the same problem. It's not that you removed Latin from the text, it's that you removed the page from Category:Latin American Spanish. The script works by examining a page's categories; if the page appears in a category that starts with a given language-name, then the script infers that the page has a section for that language. (In particular, the script does not download and parse the page. That would give more accurate results, but would be prohibitively expensive.) —RuakhTALK 00:09, 7 April 2013 (UTC)
- And that answers the lingering question I had about why orange appears when I try to link to a PoS or Etymology section in English. DCDuring TALK 00:40, 7 April 2013 (UTC)
- Does the script not look at hidden categories? 新 has a hidden Cantonese category (definition needed). Why not use it? DCDuring TALK 00:44, 7 April 2013 (UTC)
-
-
- Re: "Does the script not look at hidden categories?": Correct, it doesn't. Re: "Why not […] ?": Not being the author, I can only speculate, but it sort of makes sense to me: if our only Cantonese category is Category:Cantonese definitions needed, then we arguably don't really have a Cantonese entry. I mean, sure, we've got the ==Cantonese== L2 header, but we don't even identify the POS? —RuakhTALK 02:20, 7 April 2013 (UTC)
- I don't know if it would help or even be useful, but we could restrict the script further by requiring that a PoS category have a specific name. The script would have a list of possible PoS names which it could choose from, and it would make the link orange if it finds no match. Since "American Spanish" isn't a PoS category, it would fix the problem above. —CodeCat 02:31, 7 April 2013 (UTC)
- That doesn't seem like it should impose much of a performance penalty. (Does Lua/Scribunto help?) If there is any significant performance penalty, should just learn to live with the problem.
- I guess there could be other instances of a name of regional dialect or dialect grouping starting with a language name. DCDuring TALK 04:00, 7 April 2013 (UTC)
- No performance penalty, no. (Also, that part of it would be handled entirely on the client-side (your browser), via JavaScript, so even if it did have a performance impact, the considerations would be a bit different than for stuff that runs on the server-side.) —RuakhTALK 05:26, 7 April 2013 (UTC)
- have#Etymology also comes up orange (well, yellow to me) because there are no categories starting with the word Etymology. Mglovesfun (talk) 11:33, 7 April 2013 (UTC)
- We could apply the same idea to that as well. If the script is able to recognise which categories are valid, maybe it could also tell a valid language from an invalid one. The problem, of course, is that there are hundreds of languages, so putting them all into the script might be overkill. I guess this is yet another example of a case where linking to non-language sections just doesn't work. And I kind of agree with that anyway, because have#Etymology would link to the first Etymology section on the page. That works out right in this case, but what about cantar#Conjugation? And never mind if we want to link to the etymology of another language, then you're out of luck... —CodeCat 13:02, 7 April 2013 (UTC)
- It only matters for those who select the preferences box. It doesn't really matter for the users we should care most about: casual users. The simple and cheap part (PoS) may be worth installing in the script, not the language part.
- I still have hopes that links to English L2s can be to be more narrowly directed to the appropriate L2 headers like Etymology n and the PoS headers instead of running the risk of confusing users at our longer entries. DCDuring TALK 14:07, 7 April 2013 (UTC)
- I prefer the opposite, that all languages can be treated equally so that people don't get confused when something that works for English doesn't work for any other language. That doesn't mean that we wouldn't be able to link to specific sections, but we should be able to link to the specific section of any language. Currently, Mediawiki equates sections with the actual name of the header, so it assumes that every header will only ever appear once on a page. That is really a rather strange assumption. —CodeCat 14:11, 7 April 2013 (UTC)
- English is the host language. It already behaves differently.
- The less attractive we make this for normal users the less we will track real English. Plenty of users are confused by the existence of uppercase entries for English words that don't contain English sections, just German or Translingual. Why not finish the job by putting English in its alphabetical position in multilanguage entries and encourage non-English-language discussions for non-English entries? We already run the risk of turning this wiktionary into one that doesn't track current UK or US English, but some blend of what Webster 1913 tracked and Globish. And there will be fewer to challenge the dated, archaic, and obsolete glosses in our FL entries. But it will still be fun for polyglots. DCDuring TALK 14:25, 7 April 2013 (UTC)
- "Bla bla bla think of the children, it will be doom if we allow such transgressions". Ok, can you actually read what I am saying and not go off on a panic spree? —CodeCat 16:07, 7 April 2013 (UTC)
[edit] Very minor bug - Relevant script: MediaWiki:Gadget-PatrollingEnhancements.js
In the recent changes script that provides for admins the red bar at the bottom right, when I type in that bar it scrolls down the page. It's so innocuous I haven't bothered to report it yet. So if I type a really long deletion summary, I then have to scroll back up to click the red 'd' link to delete the offending page. Mglovesfun (talk) 11:24, 7 April 2013 (UTC) - What browser/system are you using? I (FWIW to whomever troubleshoots/fixes this issue) can't reproduce that behaviour; I can type lots of text into the deletion-summary bar, and the page doesn't scroll. - -sche (discuss) 22:43, 7 April 2013 (UTC)
- Hmm. Can you experiment a bit, and describe the behavior a bit more precisely? For example:
- If you type a single letter, then wait a moment, then type another letter, do you find that it scrolls a certain distance when you type the first letter, and then scrolls the same distance when you type the second letter?
- Or do you find that the scrolling only happens when you hit the space bar? (In many browsers, the space bar can be used to scroll down the page, but of course it's not supposed to do that when you type an actual space into an actual text field!)
- —RuakhTALK 23:03, 7 April 2013 (UTC)
| |