User talk:DTLHS Jul 26th 2013, 00:28, by Metaknowledge
Latest revision as of 00:28, 26 July 2013
Please see this discussion. Thanks! --Μετάknowledgediscuss/deeds 06:09, 17 September 2012 (UTC)
Differentiate The Left-Hand Side? Don't Tease a Lady about Her Size? Deflects Tournament Lances and Heraldic Shields? —RuakhTALK 12:27, 18 September 2012 (UTC)
- Wait, what am I saying? Usernames are usually nouny, not verby!
- Definitely The Largest Hoop Skirt?
- Danny's Tree, Lawn, and Home Solutions?
- Dallas, Texas Local High Schools?
- —RuakhTALK 13:09, 18 September 2012 (UTC)
Do you have access to this account anymore, or not? If not, I'd like to propose removing its sysop rights and (if you like) granting your current account sysop rights. (Though if you can't actually prove you are Nadano, I suppose some people might want to wait until you've used your current account for more than just a few weeks before empowering it.) Cheers, - -sche (discuss) 04:59, 20 September 2012 (UTC) - I have no access to the account and I don't want sysop rights. DTLHS (talk) 18:58, 20 September 2012 (UTC)
Your bot and indexing
So, I keep deleting and redeleting and redeleting a bunch of indices, like Index:Proto-Greek, and all the non-deletion edits seem to be done by your bot. Can you explain what's going on (i.e. why they keep getting created and tagged with {{d}})? --Μετάknowledgediscuss/deeds 15:51, 23 September 2012 (UTC)
- Sorry, it uses a rather crude comparison to try and detect which pages have been emptied between runs. I'll try to prevent that from happening again. DTLHS (talk) 16:11, 23 September 2012 (UTC)
- OK, thank you. By the way, this proto-indexing is really awesome, but a bunch of living languages need indices too. Currently a few (like Italian) are upkept by User:Conrad.Irwin and a few (like Telugu) are upkept by means of manual hacks, but for most languages they are non-existent or woefully incomplete. Just a suggestion if you ever feel like expanding :) --Μετάknowledgediscuss/deeds 20:50, 23 September 2012 (UTC)
Can your bot handle proto-forms being called by an {{etyl}} /{{recons}} combination? For example, please see גע־#Etymology. —Μετάknowledgediscuss/deeds 01:19, 8 October 2012 (UTC) - Maybe?
{{proto}} makes indexing easy since all of the information is contained in one template. If we move to using {{etyl}} with {{recons}} I won't be able to match the term ("ga-") with the categorizing template (etyl|gem-pro|yi) without guessing. DTLHS (talk) 01:25, 8 October 2012 (UTC) - Actually, I'm going to change "maybe" to "no". This format (with the split between source language and source term) is the reason why it's impossible to generate any kind of structured data from Wiktionary etymologies in their current state. DTLHS (talk) 01:57, 8 October 2012 (UTC)
- I think that User:CodeCat strongly favors the split approach, so I think now might be the time to inform her of this problem, if you're interested in doing so. —Μετάknowledgediscuss/deeds 02:06, 8 October 2012 (UTC)
- The indexes don't really need the
{{etyl}} template, do they? I mean, is it relevant for an index that a certain Proto-Germanic term is the origin of an English word? —CodeCat 17:36, 8 October 2012 (UTC) - Is it relevant? I guess that depends what you think the index is supposed to accomplish. I now can't distinguish between mentions of a reconstructed term and actual derivations. I started the project with the goal of showing etymological relationships, but now that that's no longer possible I could move into a more general approach. DTLHS (talk) 17:55, 8 October 2012 (UTC)
- Oh! I never realised that. I always used the indexes as just an easy way to see which reconstructed terms are being linked to. It has been very helpful in finding variant forms and fixing them to link to the entries we already have. For example if we have *-mn̥ and someone provides a link to *-men- instead, it will show up in the index so it can be spotted easily. It has also helped me find entries that do not use the canonical spelling for reconstructed terms (like ð instead of d in Germanic, or gh instead of gʰ). But in any case... I don't think things are as desperate as they seem. In almost all cases (like 99.9%) if the
{{recons}} template indicates a derivation, it is immediately preceded by {{etyl}} , or occasionally by one or more additional {{recons}} s preceded by {{etyl}} . So a derivation always has the format {{etyl|src|dst}} {{recons|word1|lang=src}}, {{recons|word2|lang=src}}... (both templates have optional parameters of course). If you could detect this, then it would be trivial to extend your bot to other reconstructed terms like in Vulgar Latin, and even attested languages that use {{term}} instead, like Old English (since the format of {{recons}} is identical to {{term}} in every way). —CodeCat 18:16, 8 October 2012 (UTC) - Like I said, maybe. I'm skeptical that anywhere near 99.9% of etymologies can be assumed to follow that format, but I'll have to do some analysis to tell for sure what kind of quality I could get. Expect something later next week (no sooner than the 17th). DTLHS (talk) 20:02, 8 October 2012 (UTC)
- Also, maybe more relevant is that not all reconstructions are listed as etymologies of others. For example, the
{{proto}} template allowed you to set lang= to empty, to indicate that it's merely to provide a link, rather than establish any etymology. If your bot could handle that case, could it not do the same here? —CodeCat 17:39, 8 October 2012 (UTC) - Sure, that would be useful. But
{{recons}} already has a lang= parameter, so this would have to be something like noderiv=1. DTLHS (talk) 17:55, 8 October 2012 (UTC)
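The format CodeCat describes above, an {{etyl|src|dst}} call immediately followed by one or more {{recons|word|lang=src}} calls, could be detected roughly as follows. This is only a minimal sketch under that assumption, not the bot's actual code: the function name `find_derivations` is hypothetical, and templates nested inside parameters or extra positional parameters on {{etyl}} are not handled.

```python
import re

# {{etyl|src|dst}} with exactly two positional parameters (an assumption).
ETYL = re.compile(r"\{\{etyl\|([^|{}]+)\|([^|{}]+)\}\}")
# A {{recons|...}} call, optionally preceded by a comma and whitespace.
RECONS = re.compile(r"\s*,?\s*\{\{recons\|([^{}]*)\}\}")


def find_derivations(wikitext):
    """Return (src, dst, [words]) for each etyl call followed by recons calls."""
    results = []
    for match in ETYL.finditer(wikitext):
        src, dst = match.group(1), match.group(2)
        pos = match.end()
        words = []
        while True:
            recons = RECONS.match(wikitext, pos)
            if not recons:
                break
            params = recons.group(1).split("|")
            positional = [p for p in params if "=" not in p]
            named = dict(p.split("=", 1) for p in params if "=" in p)
            # Only accept terms whose lang= matches the etyl source language.
            if named.get("lang") != src or not positional:
                break
            words.append(positional[0])
            pos = recons.end()
        if words:
            results.append((src, dst, words))
    return results


sample = ("From {{etyl|gem-pro|yi}} {{recons|ga-|lang=gem-pro}}, "
          "{{recons|gi-|lang=gem-pro}}. Compare {{recons|foo|lang=ine-pro}}.")
print(find_derivations(sample))
```

A {{recons}} call that is not directly preceded by {{etyl}} (the "Compare" mention in the sample) is ignored, which is the mention-versus-derivation distinction raised in the thread. How often real etymologies follow this format is exactly what the thread leaves open.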
NadandoBot.
Would you like NadandoBot to be renamed to something more DTLHS-y? —RuakhTALK 12:16, 24 September 2012 (UTC)
- Nah, the old name is fine. DTLHS (talk) 23:40, 24 September 2012 (UTC)
Why is that not a synonym? I didn't add it as one, so I don't care, but someone thought it was. If it is obsolete, then it might need some indication, but the flux of taxonomic terms seems considerable. The general question of how to present older taxa and those with disputed circumscription is important for this class of entries. We need to present not just the "latest-and-therefore-best" sense, but the others as well. We need to favor the current ones in some way, even when they are not the most common in use. We have {{defdate}} available for marking when a new term, more correct by current understanding came into use. DCDuring TALK 13:51, 11 October 2012 (UTC) - To be clear, you're talking about diff this edit where I added Feliformia as a synonym? I removed it because from what I can see they are not synonymous- Fissipedia has (had) several families that are not in Feliformia. I apologize if this isn't what you meant.
- I agree with you about the importance of older taxa- I think that a comprehensive resource for obsolete synonyms and reclassifications would be hugely useful. I don't think that our current structure facilitates that when dealing with historical meanings in complex classification schemes. DTLHS (talk) 02:42, 12 October 2012 (UTC)
- No problem about Fissipedia. I hadn't realized it was your addition.
- Our existing structure, using {{taxon}}, seems OK for valid current taxa. The difficulty is that each taxon is principally defined relative to its parent. For an invalid taxon that relationship is inherently problematic. For the older ones a large portion of the system may have changed so that the parent name is also obsolete.
- Extinct taxa have their own challenges as their relationship to the structure of the living taxa seems quite problematic and subject to change as well.
- Even for current taxa there is a constant interposing of additional layers.
- For now I am hoping to categorize our taxa by their level (lots of progress, see Category:mul:Taxonomic names), get the etymology right, get images, and eliminate redlinks. I had started Wiktionary:Taxonomic names, but the most urgent tasks don't really require any votes or much expertise. Doing the urgent tasks is increasing my lexicographic knowledge about taxa a bit, but hardly at all my biological expertise. DCDuring TALK 05:55, 12 October 2012 (UTC)
FYI. :-) —RuakhTALK 20:47, 1 November 2012 (UTC) No pressure, but if you're able to do it, I'd really appreciate it. If not, or if you don't want to, I understand. Thanks! —Μετάknowledgediscuss/deeds 04:06, 12 December 2012 (UTC) - Ok. To be clear, you're asking for pages with
א פ װ ױ ײ in the title, excluding pages with {{yi-unpointed form of}} and excluding entries with אַ אָ פּ פֿ in the title? DTLHS (talk) 04:12, 12 December 2012 (UTC) - Almost. All right except that
א is OK if it is in the beginning (i.e. rightmost edge) of a title and immediately followed by either of the following: י ו . Thanks! —Μετάknowledgediscuss/deeds 04:15, 12 December 2012 (UTC) - In my defence, the reason that I keep typing < and > backward is that I'm switching between Hebrew and English keyboards faster than I'm typing, and the right-to-left equivalent is coming out :) —Μετάknowledgediscuss/deeds 04:24, 12 December 2012 (UTC)
Well, I only found 14 entries: - פייוול
- פײװל
- אדרבא
- צײַטווײַליק
- מוטער־צײכן
- זײַ געזונט
- זײַ
- זײַנע
- מײַנע
- דײַנע
- סײַ
- ־קײַט
- ־הײַט
- בײַכער
I'll check my code, I don't know if you were expecting more. DTLHS (talk) 04:56, 12 December 2012 (UTC) - Well, there is something wrong with your code, but there is also something wrong with my instructions. I forgot to say that
ײַ is one of the exceptions (i.e. it's also OK). I'm not sure why you only caught some pages like that, but not others. In any case, at least a few of these aren't false positives, so I'll fix them. Thanks! —Μετάknowledgediscuss/deeds 05:42, 12 December 2012 (UTC) In this revision, as Nadando, you added an etymology for the nickname sense. It seems implausible and I can find no mention of it at Google Books, Scholar, or at the OneLook references. Where did you find it? Why do you believe it? DCDuring TALK 04:37, 26 February 2013 (UTC) - It was actually added in this edit by Tonyrail. I just moved it to its own etymology section. I have nothing to say about the actual facts of the etymology. DTLHS (talk) 07:09, 26 February 2013 (UTC)
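The Yiddish title check requested in the December 2012 thread above can be sketched as below. This is one reading of the instructions, not the code DTLHS actually ran: it treats a title as a candidate when it contains a bare א or פ, or any of װ ױ ײ, with the stated exceptions (pointed אַ אָ פּ פֿ, the digraph ײַ, and a title-initial א followed by י or ו). The function names are hypothetical, and "beginning of the title" is taken literally as the first character.

```python
import unicodedata

ALEF, PE, YOD, VAV = "\u05d0", "\u05e4", "\u05d9", "\u05d5"
DOUBLE_VAV, VAV_YOD, DOUBLE_YOD = "\u05f0", "\u05f1", "\u05f2"
PATAH, QAMATS, DAGESH, RAFE = "\u05b7", "\u05b8", "\u05bc", "\u05bf"


def looks_unpointed(title):
    """True if the title contains a letter that the thread says needs review."""
    # NFD splits precomposed forms such as U+FB2E into base letter + point.
    text = unicodedata.normalize("NFD", title)
    for i, ch in enumerate(text):
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == ALEF:
            if nxt in (PATAH, QAMATS):
                continue  # pointed alef is fine
            if i == 0 and nxt in (YOD, VAV):
                continue  # initial silent alef before yod or vav is fine
            return True
        if ch == PE:
            if nxt in (DAGESH, RAFE):
                continue  # pointed pe is fine
            return True
        if ch in (DOUBLE_VAV, VAV_YOD):
            return True
        if ch == DOUBLE_YOD and nxt != PATAH:
            return True  # bare double yod; double yod + patah is fine
    return False


def is_candidate(title, wikitext):
    """Apply the title check, skipping pages that use {{yi-unpointed form of}}."""
    return looks_unpointed(title) and "{{yi-unpointed form of" not in wikitext


print(looks_unpointed("\u05d0\u05d3\u05e8\u05d1\u05d0"))  # bare alef
print(looks_unpointed("\u05d6\u05f2\u05b7"))              # double yod + patah
```

Under this reading, titles built around ײַ would not be reported, which matches the later correction that ײַ is also acceptable.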
deletion templates
From a procedural standpoint you are correct about the RFV. I will use this tag in the future. On the other hand, the references I removed from those words pointed to blank pages. Would you agree? Kultur (talk) 21:19, 4 April 2013 (UTC)
- No, because they could be valid entries some day. Red links in related terms sections are not a problem. DTLHS (talk) 22:44, 4 April 2013 (UTC)
Language code templates called directly
How exactly are you fixing these? Just by substituting them? —CodeCat 21:50, 24 June 2013 (UTC)
- Most of them are in etymology sections, so with etyl. Another large category is malformed headline templates. DTLHS (talk) 21:52, 24 June 2013 (UTC)
- Etyl should only be used when the language is part of the origin of the word. Otherwise it adds the wrong categories. —CodeCat 22:07, 24 June 2013 (UTC)
- That's what I've been doing, but thanks for the reminder. DTLHS (talk) 22:09, 24 June 2013 (UTC)
- What I mean is that if you use it for a cognate, the entry will be added to the wrong category. Have all the uses so far been intended as the actual etymological origin of the word? —CodeCat 22:14, 24 June 2013 (UTC)
- {{etyl|langcode|-}} will not categorize. DTLHS (talk) 22:16, 24 June 2013 (UTC)
- Ok... but what is the advantage in that? —CodeCat 22:17, 24 June 2013 (UTC)
- 1. It provides the correct classes and consistency to show that this language is involved in the etymology but not directly related 2. It removes direct language template uses, eventually allowing the template to be deleted 3. It provides consistency in cases where language names could be changed (ideally all language names would be provided through templates, but that's obviously not going to happen any time soon). DTLHS (talk) 22:22, 24 June 2013 (UTC)
- Well 3 doesn't make much sense currently, like you said. As for 1, I am not sure there really is much value in it because it replaces the name with a call to a moderately complex template, without any outward benefit - only the wikicode is different. I'm not sure if the reason you gave is really a very good reason. 2 is good though. —CodeCat 22:31, 24 June 2013 (UTC)
- I think it's acceptable to use etyl in these cases. (I use it, too, and DTLHS and I aren't the only ones who do.) Another benefit is that etyl can cause the displayed text to link to a language's entry or WP page. Indeed, first Ruakh and later, IIRC, you (CodeCat) proposed to re-enable that feature by default, but no-one's done so yet AFAICT... - -sche (discuss) 02:47, 25 June 2013 (UTC)
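For illustration, the non-categorizing replacement mentioned above, {{etyl|langcode|-}} in place of a bare language template, could be scripted along these lines. This is a sketch under assumptions, not DTLHS's actual method: the function name and the set of language codes are hypothetical, and it applies the `-` form blindly, whereas (as CodeCat notes) a true origin language would need a real second parameter.

```python
import re


def wrap_language_templates(wikitext, language_codes):
    """Replace bare {{code}} calls with the non-categorizing {{etyl|code|-}}."""
    if not language_codes:
        return wikitext
    # Longest codes first so that "gem-pro" is tried before "gem".
    ordered = sorted(language_codes, key=len, reverse=True)
    alternatives = "|".join(re.escape(code) for code in ordered)
    pattern = re.compile(r"\{\{(" + alternatives + r")\}\}")
    return pattern.sub(r"{{etyl|\1|-}}", wikitext)


text = "Cognate with {{de}} Wasser; from {{etyl|ang|en}}."
print(wrap_language_templates(text, {"de", "fr"}))
```

Existing {{etyl}} calls are left alone because the pattern only matches a template whose whole content is one of the given codes.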
- I don't know if it's because they're the only pages I have watchlisted and notice, or because they're the bulk of the affected pages, but I notice that a lot of the direct calls to language templates that you're fixing are in etymology sections I added. I'm sorry I didn't format them with
{{etyl}} when I added them! I usually added them to Appendix:English terms of Native American origin first and then copied the appendix entries into the main namespace... and apparently regularly forgot to add {{etyl}} when I did so. Thanks for cleaning them up. - -sche (discuss) 17:26, 28 June 2013 (UTC) Thanks for making Wiktionary:Todo/English verb form cleanup, which showed lots of incorrectly named pages. Any chance you could make the page again, as (at least, I think) all the previous entries have been corrected. Perhaps this time you could do the same for third-person singular present tense, as I've found a few of them manually (commits suicide for example). Thanks a lot! --Semper amore (talk) 14:52, 30 June 2013 (UTC) - I'll do it once the next dump comes out. There was a bug in my code for the present tense forms, so it should catch those now. DTLHS (talk) 17:00, 30 June 2013 (UTC)
Swahili noun forms
I was wondering if you could generate a list of all the pages which have ===Noun form=== or ====Noun form==== under ==Swahili==. Thank you —Μετάknowledgediscuss/deeds 05:07, 19 July 2013 (UTC)
- Only one page: vitabu DTLHS (talk) 05:40, 19 July 2013 (UTC)
- Wow, I guess I already found the rest. Thank you! —Μετάknowledgediscuss/deeds 14:06, 19 July 2013 (UTC)
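A check like the one requested above (a "Noun form" header at level 3 or 4 inside the ==Swahili== section) might look like this when run over page text from a dump. It is a minimal sketch with hypothetical function names; how the wikitext is read from the dump is left out.

```python
import re

# Level-2 language headers such as ==Swahili== (but not ===Noun===).
LANG_HEADER = re.compile(r"^==([^=\n].*?)==\s*$", re.MULTILINE)
# ===Noun form=== or ====Noun form====
NOUN_FORM = re.compile(r"^={3,4}\s*Noun form\s*={3,4}\s*$", re.MULTILINE)


def language_section(wikitext, language):
    """Return the text between ==language== and the next level-2 header."""
    headers = list(LANG_HEADER.finditer(wikitext))
    for i, header in enumerate(headers):
        if header.group(1).strip() == language:
            end = headers[i + 1].start() if i + 1 < len(headers) else len(wikitext)
            return wikitext[header.end():end]
    return ""


def has_noun_form(wikitext, language="Swahili"):
    """True if the language section contains a 'Noun form' header."""
    return bool(NOUN_FORM.search(language_section(wikitext, language)))


page = ("==English==\n===Noun form===\nx\n\n"
        "==Swahili==\n===Noun===\n====Noun form====\n# plural\n")
print(has_noun_form(page))
```

Restricting the search to the language section keeps a "Noun form" header under another language on the same page from producing a false match.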
- Another request: are there any pages that use {{head|sw|plural}}? (I would just set {{head}} to categorise them, but I think editing {{head}} should be kept to a minimum, perhaps because it would take so long to update the database... but if you disagree, I can do that instead.) —Μετάknowledgediscuss/deeds 23:40, 25 July 2013 (UTC)
- magari, mabaki, matunda, mataifa, marefu, maandiko, miaridi, maafa (this is from the 11th, I can run it again when the next dump comes out if you care). DTLHS (talk) 23:49, 25 July 2013 (UTC)
- Asante sana ! It's not necessary to run it again, hardly anybody has edited Swahili for years until I started recently. —Μετάknowledgediscuss/deeds 23:51, 25 July 2013 (UTC)
- If you are familiar with pywikipediabot, you can make a script that fetches all words in a certain category, then a list of all pages that transclude a template, and subtract the two (set difference). That will give you all pages that don't transclude that template. —CodeCat 23:55, 25 July 2013 (UTC)
- I've tried using AWB to do it, but I don't know how to get virtual Windows for free, so I simply haven't enough time on computers running Windows. If you feel inspired to send me an email explaining what to do so I don't have to keep bothering other people, by all means do so, but keep in mind my ignorance :) —Μετάknowledgediscuss/deeds 00:28, 26 July 2013 (UTC)
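The set-difference approach CodeCat describes above does not need Windows or AWB. A minimal sketch follows, assuming Pywikibot (the successor to pywikipediabot) is installed and configured; the helper names and the category and template titles passed in are hypothetical. The live-wiki function is only a wrapper, and the set arithmetic can be tried offline.

```python
def pages_missing_template(category_titles, transcluding_titles):
    """Titles in the category that do not transclude the template."""
    return sorted(set(category_titles) - set(transcluding_titles))


def fetch_missing(category_name, template_name, lang="en", family="wiktionary"):
    """Query the live wiki; requires the third-party pywikibot package."""
    import pywikibot  # imported here so the pure function works without it

    site = pywikibot.Site(lang, family)
    category = pywikibot.Category(site, category_name)
    template = pywikibot.Page(site, template_name)
    in_category = (page.title() for page in category.articles())
    transcluding = (page.title() for page in
                    template.getReferences(only_template_inclusion=True))
    return pages_missing_template(in_category, transcluding)


# Offline example with made-up inputs:
print(pages_missing_template(["magari", "vitabu", "mabaki"], ["vitabu"]))
```

A call such as `fetch_missing("Category:Swahili nouns", "Template:sw-noun")` would then list the category members that do not use that template; both titles here are placeholders rather than names confirmed by the thread.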