Talk:Statistical machine translation

The ref. for Weaver is all screwed-up. Is it 1949 or 1955? I don't have access to the reference.

Also it's a bit misleading to attribute SMT to Weaver and say Brown et al. "re-introduced it". As far as anybody knows, Weaver never actually developed any SMT system. The statistical models that Brown et al. introduced (and their combination with a language model) had never been introduced before, afaik. Also, there is a short paper in CL in 1990, vol. 16, and their longer reference paper appeared in CL in 1993, vol. 19. However, this work was developed and presented at the end of the 80s. Sunny house (talk) 19:42, 11 March 2008 (UTC)

'Intuitive' to apply Bayes' theorem?

"As a representation of the process by which a human being translates a passage from French to English, this equation is fanciful at best. One can hardly imagine someone rifling mentally through the list of all English passages computing the product of the a priori probability of the passage, Pr(e), and the conditional probability of the French passage given the English passage, Pr(f|e). Instead, there is an overwhelming intuitive appeal to the idea that a translator proceeds by first understanding the French, and then expressing in English the meaning that he has thus grasped. Many people have been guided by this intuitive picture when building machine translation systems." http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf BeauPhenomene (talk) 07:20, 24 April 2014 (UTC)
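
For anyone following the quoted passage, the search it describes (going through candidate English passages and computing Pr(e) times Pr(f|e) for each) can be sketched in a few lines of Python. This is only a toy illustration: the candidate list, the probability values, and the names `decode`, `language_model`, and `translation_model` are made up here and do not come from Brown et al. or any real system.

```python
# Toy noisy-channel decoder: choose the English candidate e that maximizes
# Pr(e) * Pr(f|e). All names and probabilities are hypothetical illustrations.

def decode(french, language_model, translation_model):
    """Return (best English candidate, its score) for the given French input."""
    best, best_score = None, 0.0
    for english, prior in language_model.items():
        # Pr(f|e): probability of the French passage given this English one.
        likelihood = translation_model.get((french, english), 0.0)
        score = prior * likelihood
        if score > best_score:
            best, best_score = english, score
    return best, best_score


# Pr(e): made-up prior probabilities of three English candidates.
language_model = {"the house": 0.6, "house the": 0.1, "the home": 0.3}

# Pr(f|e): made-up conditional probabilities, keyed by (french, english).
translation_model = {
    ("la maison", "the house"): 0.5,
    ("la maison", "house the"): 0.5,
    ("la maison", "the home"): 0.4,
}

best, score = decode("la maison", language_model, translation_model)
```

Note that "house the" is penalized only by the prior Pr(e), which is the role the language model plays in this formulation. A real system cannot enumerate every English passage as this loop does, which is the point the quote makes.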

"Data dilution"

I hereby claim that 173.13.56.41 is a PR guy for safaba and that his contributions regarding "Data dilution" are advertising or a way to make a "key issue" that safaba conveniently solves seem "well known".

The link http://www.machinetranslation.net/ or "a quick guide to machine translation" links to a site that very briefly explains that machine translation is something that corporations can use and that data dilution is a common issue. Luckily, safaba has a solution for this. As the site is held by safaba, and offers no objective or additional information, I call for the link to be removed.

The term "data dilution" seems to me to be nothing more than "training on data from another/too general topic", which obviously leads to phrase translations from another topic. I have never encountered the term in a scientific environment and suspect it to be coined by safaba (the first page of Google results for "data dilution"+ machine+translation all lead to safaba or this Wikipedia page). It seems highly misplaced among the other general problems. So it should be either generalized to "acquiring an adequate training corpus is hard" or stripped of any commercial links and terms. — Preceding unsigned comment added by 2A02:8071:2785:5200:28D4:27A2:EC64:ABAB (talk) 16:36, 20 June 2014 (UTC)

External links modified

Hello fellow Wikipedians,

I have just added archive links to one external link on Statistical machine translation. Please take a moment to review my edit. You may add {{cbignore}} after the link to keep me from modifying it, if I keep adding bad data, but formatting bugs should be reported instead. Alternatively, you can add {{nobots|deny=InternetArchiveBot}} to keep me off the page altogether, but should be used as a last resort. I made the following changes:

When you have finished reviewing my changes, please set the checked parameter below to true or failed to let others know (documentation at {{Sourcecheck}}).

This message was posted before February 2018. After February 2018, "External links modified" talk page sections are no longer generated or monitored by InternetArchiveBot. No special action is required regarding these talk page notices, other than regular verification using the archive tool instructions below. Editors have permission to delete these "External links modified" talk page sections if they want to de-clutter talk pages, but see the RfC before doing mass systematic removals. This message is updated dynamically through the template {{source check}} (last update: 18 January 2022).

  • If you have discovered URLs which were erroneously considered dead by the bot, you can report them with this tool.
  • If you found an error with any archives or the URLs themselves, you can fix them with this tool.

Cheers.—cyberbot II Talk to my owner:Online 23:07, 27 March 2016 (UTC)

As the translation systems are not able to store all native strings and their translations....

Thanks for your efforts in creating this concise review of a complex and controversial topic. I suggest that some informed editor add a reference and/or explanation for the phrase I've quoted in the subject line.

Also, this article is, IMO, too "primary source": there is a general lack of references, especially in this fragment (subject line) and the following paragraph. For instance, take the phrase "but even this is not enough." First off, that could use some additional explanation: enough for what? Enough to do an acceptable translation? To reduce storage? What? And then, who says? This and many other parts of the article need references to primary sources.

Ronewolf (talk) 19:38, 4 March 2017 (UTC)