{"id":15419,"date":"2024-08-06T11:37:16","date_gmt":"2024-08-06T09:37:16","guid":{"rendered":"http:\/\/costops.com\/index.php\/2024\/08\/06\/why-ais-tom-cruise-problem-means-it-is-doomed-to-fail\/"},"modified":"2024-08-06T11:37:16","modified_gmt":"2024-08-06T09:37:16","slug":"why-ais-tom-cruise-problem-means-it-is-doomed-to-fail","status":"publish","type":"post","link":"http:\/\/costops.com\/index.php\/2024\/08\/06\/why-ais-tom-cruise-problem-means-it-is-doomed-to-fail\/","title":{"rendered":"Why AI\u2019s Tom Cruise problem means it is \u2018doomed to fail\u2019"},"content":{"rendered":"<p>LLMs\u2019 \u2018reversal curse\u2019 leads it to fail at drawing relationships between simple facts. It\u2019s a problem that could prove fatal<\/p>\n<p>In 2021, linguist Emily Bender and computer scientist Timnit Gebru <a href=\"https:\/\/dl.acm.org\/doi\/pdf\/10.1145\/3442188.3445922\">published a paper<\/a> that described the then-nascent field of language models as one of \u201cstochastic parrots\u201d. A language model, they wrote, \u201cis a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning.\u201d<\/p>\n<p>The phrase stuck. AI can still get better, even if it is a stochastic parrot, because the more training data it has, the better it will seem. But does something like ChatGPT actually display anything like intelligence, reasoning, or thought? Or is it simply, at ever-increasing scales, \u201chaphazardly stitching together sequences of linguistic forms\u201d?<\/p>\n<p><em>If a human learns the fact, \u201cValentina Tereshkova was the first woman to travel to space\u201d, they can also correctly answer, \u201cWho was the first woman to travel to space?\u201d This is such a basic form of generalization that it seems trivial. 
Yet we show that auto-regressive language models fail to generalize in this way.<\/em><\/p>\n<p><em>This is an instance of an ordering effect we call the Reversal Curse.<\/em><\/p>\n<p><em>We test GPT-4 on pairs of questions like, \u201cWho is Tom Cruise\u2019s mother?\u201d and, \u201cWho is Mary Lee Pfeiffer\u2019s son?\u201d for 1,000 different celebrities and their actual parents. We find many cases where a model answers the first question (\u201cWho is &lt;celebrity&gt;\u2019s parent?\u201d) correctly, but not the second. We hypothesize this is because the pretraining data includes fewer examples of the ordering where the parent precedes the celebrity (eg \u201cMary Lee Pfeiffer\u2019s son is Tom Cruise\u201d).<\/em><\/p>\n<p> <a href=\"https:\/\/www.theguardian.com\/technology\/article\/2024\/aug\/06\/ai-llms\">Continue reading&#8230;<\/a><br \/>\n<img src=\"https:\/\/i.guim.co.uk\/img\/media\/6e3466a665b47e451edfe4b54a9640328ec6f8ee\/0_0_8082_4849\/master\/8082.jpg?width=140&amp;quality=85&amp;auto=format&amp;fit=max&amp;s=f9099a9f9fe8227df78c90de2a9b4290\" title=\"Why AI\u2019s Tom Cruise problem means it is \u2018doomed to fail\u2019\" \/>Technology | The Guardian<\/p>\n","protected":false},"excerpt":{"rendered":"<p>LLMs\u2019 \u2018reversal curse\u2019 leads it to fail at drawing relationships between simple facts. It\u2019s a problem that could prove fatal In 2021, linguist Emily Bender and computer scientist Timnit Gebru published a paper that described the then-nascent field of language models as one of \u201cstochastic parrots\u201d. 
A language model, they wrote, \u201cis a system for &hellip;<\/p>\n<p class=\"read-more\"> <a class=\"\" href=\"http:\/\/costops.com\/index.php\/2024\/08\/06\/why-ais-tom-cruise-problem-means-it-is-doomed-to-fail\/\"> <span class=\"screen-reader-text\">Why AI\u2019s Tom Cruise problem means it is \u2018doomed to fail\u2019<\/span> Read More &raquo;<\/a><\/p>\n","protected":false},"author":0,"featured_media":15420,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":[],"categories":[1],"tags":[],"_links":{"self":[{"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/posts\/15419"}],"collection":[{"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/types\/post"}],"replies":[{"embeddable":true,"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/comments?post=15419"}],"version-history":[{"count":0,"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/posts\/15419\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/media\/15420"}],"wp:attachment":[{"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/media?parent=15419"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/categories?post=15419"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/costops.com\/index.php\/wp-json\/wp\/v2\/tags?post=15419"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}