Thursday, March 1, 2012

Wiktionary - Recent changes [en]: User talk:Dan Polansky

Mar 1st 2012, 11:51

Restricting translations to lemma:

Revision as of 11:51, 1 March 2012, line 239:
 
Above, I have been working with languages that have the three genders m, f, and n. What remains is to do the same for languages with only the two genders m and f, such as, probably, Italian, French and Spanish. Using the same technique there is likely to generate a significant number of false positives. Matching on all three genders almost always selects adjectival translations; a similar matching on only masculine and feminine would probably also select many nouns, such as the analogues of English "actor" and "actress" in these languages. To fix this, one would have to make sure that the translation being matched is within an adjective section, which does not seem possible using AWB regexp replacements. --[[User:Dan Polansky|Dan Polansky]] ([[User talk:Dan Polansky|talk]]) 10:44, 1 March 2012 (UTC)
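The false-positive concern can be illustrated with a minimal Python sketch. The pattern `MF_PAIR` below is a hypothetical two-gender analogue of the three-gender matching, written for illustration only; it is not a pattern taken from this discussion, and Python's `re` stands in for AWB's regexp engine. The French sample strings are likewise made up for the example.

```python
import re

# Hypothetical m/f analogue of the three-gender matching (an assumption,
# not a pattern used in the actual AWB runs): two adjacent translations
# in the same language that share their first three letters, the first
# tagged m and the second tagged f.
MF_PAIR = re.compile(
    r"\{\{(t-?\+?)\|(..)\|(...)([^|]*?)\|m(\|?.*?)\}\}, "
    r"\{\{t-?\+?\|\2\|\3[^|]*?\|f\|?.*?\}\}"
)


def matches_mf_pair(wikitext: str) -> bool:
    """Return True if the text contains an m/f pair of the shape above."""
    return MF_PAIR.search(wikitext) is not None


# Illustrative inputs: an adjective pair and a noun pair.
adjective = "{{t|fr|petit|m}}, {{t|fr|petite|f}}"
noun = "{{t|fr|acteur|m}}, {{t|fr|actrice|f}}"

print(matches_mf_pair(adjective))  # True: the intended kind of match
print(matches_mf_pair(noun))       # True: a false positive, since this is a noun pair
```

Because the pattern sees only the translation line and not the part-of-speech heading above it, it cannot tell the two cases apart.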
 
Above, I have been working with languages that have the three genders of m, f, and n. What remains to be done is the same for languages with the two genders of m and f, such as--probably--Italian, French and Spanish. Using the same technique is likely to generate a significant number of false positives. The matching on all three genders almost always selects adjectival translations; a similar matching for only masculine and feminine would probably select many nouns, such as analogues of English "actor" and "actress" in these languages. To fix this, one would have to make sure that the translation being matched is within an adjective section, which is nowhere obviously possible using AWB regexp replacements. --[[User:Dan Polansky|Dan Polansky]] ([[User talk:Dan Polansky|talk]]) 10:44, 1 March 2012 (UTC)
 
: I have manually fixed the few items that were of the form m|p, f|p and n|p. --[[User:Dan Polansky|Dan Polansky]] ([[User talk:Dan Polansky|talk]]) 11:22, 1 March 2012 (UTC)
 
: I have manually fixed the few items that were of the form m|p, f|p and n|p. --[[User:Dan Polansky|Dan Polansky]] ([[User talk:Dan Polansky|talk]]) 11:22, 1 March 2012 (UTC)
A further batch, heavily supervised with several skips and manual corrections, for the likes of {{<nowiki/>t|...|m|f}}, {{<nowiki/>t|...|n}}:
* #edits: 32
* Edit summary: restrict adjective translation to lemma
* Regexp table:
<small>
<pre>
{{(t-?\+?)\|(..)\|(...)([^\|]*?)\|m\|f(\|?.*?)}}, {{t-?\+?\|\2\|\3([^\|]*?)\|n\|?.*?}}, {{t-?\+?\|\2\|\3([^\|]*?)\|p\|?.*?}} {{$1|$2|$3$4$5}} False True True False False False True
{{(t-?\+?)\|(..)\|(...)([^\|]*?)\|m\|f(\|?.*?)}}, {{t-?\+?\|\2\|\3([^\|]*?)\|n\|?.*?}} {{$1|$2|$3$4$5}} False True True False False False True
</pre>
</small>
--[[User:Dan Polansky|Dan Polansky]] ([[User talk:Dan Polansky|talk]]) 11:51, 1 March 2012 (UTC)
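To show what the second row of the table does, here is a rough Python transcription of it. This is a sketch under assumptions: Python's `re` replaces AWB's regexp engine, the braces are escaped, `$1`-style references become `\1`, the trailing True/False AWB options are ignored, and the function name `restrict_to_lemma` and the Latin sample are chosen for illustration only.

```python
import re

# Second row of the regexp table, transcribed for Python's re module.
# Groups: 1 = template name (t, t-, t+), 2 = two-letter language code,
# 3 = first three letters of the word, 4 = rest of the word,
# 5 = any parameters following |m|f.
MF_THEN_N = re.compile(
    r"\{\{(t-?\+?)\|(..)\|(...)([^|]*?)\|m\|f(\|?.*?)\}\}, "
    r"\{\{t-?\+?\|\2\|\3([^|]*?)\|n\|?.*?\}\}"
)


def restrict_to_lemma(wikitext: str) -> str:
    """Keep only the first (m|f) translation, with its gender tags removed."""
    return MF_THEN_N.sub(r"{{\1|\2|\3\4\5}}", wikitext)


sample = "* Latin: {{t|la|brevis|m|f}}, {{t|la|breve|n}}"
print(restrict_to_lemma(sample))  # * Latin: {{t|la|brevis}}
```

The backreferences `\2` and `\3` are what tie the two templates together: the second translation must be in the same language and begin with the same three letters as the first, which is why the match rarely fires on unrelated neighbouring translations.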
