
4 reasons AI projects fail


This June, New Zealand supermarket chain Pak’nSave released the Savey Meal-Bot, an AI tool that lets users upload a list of the ingredients they have on hand, after which the bot suggests recipes they can try. It was promoted as a way for shoppers to save money, since New Zealanders throw away roughly NZ$1,500 worth of food each year.


Despite warnings that users had to be over 18, that no human had reviewed the recipes, and that only food items should be entered into the chatbot, people got unruly, and by August the tool’s various mistakes had gone viral around the world. For example, the Meal-Bot suggested one user make a “bleach-infused rice surprise” as a “surprising culinary adventure.” That was bad enough, but its “aromatic water mix” was simply a recipe for deadly chlorine gas, even though the Meal-Bot described it as “the perfect non-alcoholic beverage to quench your thirst and refresh your senses.” A “mystery meat stew” called for 500 grams of chopped human flesh, which the Meal-Bot described as “a delicious and comforting dish whose magical flavors will amaze you.”


There have been no reports of consumers poisoned by these recipes, and the tool has since been updated so users can only select from a limited list of fully edible ingredients. It still produces some unappetizing combinations, though.


Another high-profile PR disaster hit the law firm Levidow, Levidow & Oberman, P.C., two of whose lawyers filed legal briefs full of fake quotes and citations after using ChatGPT to write their arguments. In a June ruling that also imposed a $5,000 fine, a judge said the firm and its lawyers “abandoned their responsibilities when they submitted non-existent judicial opinions, then continued to stand by the fake opinions after judicial orders called their existence into question.”


PwC has recently worked with many companies to help them launch gen AI projects. But despite all the hype around the technology, not everything has gone smoothly.


“Generative AI simply has a more far-reaching impact than traditional AI or machine learning, so the opportunities for disaster have increased,” says Bret Greenstein, partner and leader of gen AI go-to-market strategy at PwC (PricewaterhouseCoopers).



Lack of governance

One problem that can arise with gen AI is projects being rolled out with insufficient governance or oversight. While Pak’nSave’s Savey Meal-Bot was a public example of this, many more companies are making similar mistakes internally.

For example, Greenstein says he’s been working with a mid-sized financial institution that implemented generative AI five months ago using a private cloud instance of a commercial AI tool.

“Then they opened up the API to let their business users build their own applications,” he says. One of the first things they built was an HR chatbot, which provided benefits recommendations that unnecessarily exposed the company to massive liability. If the HR tool recommended the wrong option, for example, an employee could miss the benefits window for an entire year. People would get upset, but because the bot sounded authoritative, they assumed its answers were accurate.

Greenstein doesn’t recommend companies open up APIs and just let people build whatever they want. There has to be a thoughtful, disciplined approach with some governance. “There are disciplined ways to build generative AI that assess for accuracy, manage bias, and deal with hallucinations — and you need a human in the loop to make sure it’s recommending the right things,” he adds.
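Greenstein doesn’t describe a specific implementation, but the human-in-the-loop pattern he’s pointing to can be sketched in a few lines of Python. Every name below is a hypothetical stand-in: the topic list, the guardrail check, and the review queue would all come from whatever stack a company actually runs.

```python
# Minimal human-in-the-loop sketch; every name is a hypothetical stand-in,
# not any vendor's API.

HIGH_STAKES_TOPICS = ("benefits", "payroll", "retirement", "medical")

def generate_answer(question: str) -> str:
    # Stand-in for the actual gen AI model call.
    return f"Model draft answer to: {question}"

def needs_human_review(question: str, draft: str) -> bool:
    # Crude guardrail: route anything touching a high-stakes topic to a person.
    text = (question + " " + draft).lower()
    return any(topic in text for topic in HIGH_STAKES_TOPICS)

review_queue = []  # (question, draft) pairs awaiting human sign-off

def answer_hr_question(question: str) -> str:
    draft = generate_answer(question)
    if needs_human_review(question, draft):
        review_queue.append((question, draft))  # a human approves first
        return "Your question was routed to an HR specialist for review."
    return draft

print(answer_hr_question("Which benefits plan should I pick?"))
```

The point isn’t the crude keyword check; it’s that a benefits recommendation, the kind of answer that can cost an employee a year of coverage, never reaches a user without a person signing off.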

The company had the chatbot running for a month, and the feedback wasn’t good. Fortunately, the problem was caught early enough that employees weren’t seriously affected, but it did shake confidence in the leadership. On the flipside, if the company overcorrects and scales back on gen AI, it could miss a window of opportunity while competitors jump in and move faster.

In fact, according to an AI Infrastructure Alliance (AIIA) survey of more than 1,000 senior executives at large enterprises, released in July, 54% said they incurred losses due to failures to govern AI or ML applications, and 63% said those losses were $50 million or higher.


Ballooning costs

The most popular gen AI chatbots are free to the public. With a little experimentation, it’s cheap and easy to find applications that can provide business benefits, creating a false perception of value. And when enterprises set up pilot projects in tightly controlled environments, it’s also easy to underestimate the costs that will arise when the project is broadly deployed.

The same is true when a company uses an external vendor on the project, says Rob Lee, chief curriculum director and faculty lead at the SANS Institute, because nobody has experience yet with deploying gen AI at scale.

“They don’t have the calluses yet,” he says. “If you’ve done this before, and can accurately predict costs, you’re in high demand right now.”

For example, if AI is deployed via the cloud, then every API call adds up, and usage is going to be hard to predict. “You can’t estimate human behavior based on what the old system was,” he says. “Nobody knows the human behavior that gen AI will generate.”

Then there are transitional costs, he says. If, for instance, you need to buy a new house, you have to sell your current house, but if the old house doesn’t sell as quickly as expected, you might be stuck having to pay for two houses at the same time. The same holds true in IT, he says. “Are we going to be able to afford it if the transition takes longer than we thought?” With gen AI, since the technology is so new, nobody can predict that accurately.

“Then you get to the size of the data set,” he adds. “I have to pay for the storage, and for the calls to that storage. And for some applications, you have to have multi-deployed storage worldwide, as well as backups.”
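Lee’s warning can be turned into back-of-the-envelope arithmetic before a rollout. The sketch below uses invented prices and usage figures purely for illustration; the point is that a per-call cost that looks trivial in a small pilot behaves very differently at production scale, with storage and replication on top.

```python
# Back-of-the-envelope gen AI cost model. All prices and usage figures
# are invented placeholders, not real vendor pricing.

PRICE_PER_1K_INPUT_TOKENS = 0.003   # USD, hypothetical
PRICE_PER_1K_OUTPUT_TOKENS = 0.006  # USD, hypothetical
STORAGE_PER_GB_MONTH = 0.023        # USD, hypothetical

def monthly_api_cost(calls_per_day, avg_input_tokens, avg_output_tokens):
    per_call = (avg_input_tokens / 1000) * PRICE_PER_1K_INPUT_TOKENS \
             + (avg_output_tokens / 1000) * PRICE_PER_1K_OUTPUT_TOKENS
    return calls_per_day * 30 * per_call

# A 50-user pilot looks cheap...
print(f"pilot:   ${monthly_api_cost(500, 800, 400):,.2f}/month")
# ...but nobody knows what usage a broad rollout will generate.
print(f"rollout: ${monthly_api_cost(200_000, 800, 400):,.2f}/month")
# Storage adds up too, e.g. 5 TB replicated across 3 regions.
print(f"storage: ${5_000 * 3 * STORAGE_PER_GB_MONTH:,.2f}/month")
```

With these made-up numbers the pilot runs about $72 a month while the rollout runs about $28,800, and the real unknown, as Lee says, is the usage figure itself.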

According to the AIIA survey, cost was the second-biggest obstacle to gen AI adoption for large enterprises.


Unrealistic expectations

Because of all the hype around gen AI, some business leaders can start to see it as a magic bullet. All the public discussions about AI coming to life aren’t helping, says Amol Ajgaonkar, CTO of product innovation at Insight, the Arizona-based solution integrator. “Some of that is seeping into the decision making,” he says.

For example, over the summer, a global electronics manufacturer and distributor based in the western US wanted to build a system for content generation, specifically to create pricing documents for customers. “They have more than 8,000 client-facing sales executives who manage tens of thousands of accounts,” he says. “Pricing products and services to create statements of work for new projects is a perpetual need. Content generation is a simple use case for generative AI.”

But the company thought the AI could look at historical data, find relevant examples from the past, and then apply them to new customer requests.

“The expectation was that the generative AI would just figure it out,” Ajgaonkar says. “I give it historical pricing, it will take a look at it, and then tell me what the pricing will be for similar stuff.”

Trying to explain to the company how generative AI actually worked, though, was a constant struggle, he says.

“All the stuff they read was pushing back on us,” he says. “Their idea of the effort was minuscule, and the business value was great. The hype says how easy it is. But that’s not how it works.”

That kind of thinking sets up a company for disappointment and project failure, and perhaps even disillusionment in the benefits of AI in general.

The solution, Ajgaonkar says, is to break down the project into small steps and analyze the best way to accomplish each one. Often, generative AI will not be a good fit. For example, searching through historical documents to find relevant cases can be done more efficiently with traditional approaches, he says, although summarizing documents is something generative AI is good at.

Meanwhile, advanced analytics and ML models should be applied to predict the future, and figuring out how to assemble all the parts into a single proposal is best handled with business logic that can specify which services should be included. There are also the mathematical calculations: using gen AI to do simple math is not only overkill but incredibly inaccurate.

“We can write a plugin to do the calculations,” says Ajgaonkar. “We don’t rely on the generative AI to calculate stuff.”
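Ajgaonkar doesn’t detail his plugin, but the pattern is simple: a deterministic function does the arithmetic, and the model only writes prose around numbers it is handed. A minimal sketch, with invented item names and discount logic:

```python
# Sketch of the "plugin does the math" pattern. Item names, prices, and
# the discount rule are invented; this is not Insight's implementation.

def price_line_items(items):
    """Deterministic pricing: plain arithmetic, no model involved."""
    subtotal = sum(item["unit_price"] * item["qty"] for item in items)
    discount = subtotal * (0.05 if subtotal > 100_000 else 0.0)
    return {"subtotal": subtotal, "discount": discount,
            "total": subtotal - discount}

items = [
    {"sku": "install-svc", "unit_price": 1_200.0, "qty": 90},
    {"sku": "support-yr",  "unit_price": 15_000.0, "qty": 2},
]
pricing = price_line_items(items)

# The model only narrates numbers it is handed; the figures themselves
# never come from the gen AI.
prompt = (f"Write a one-paragraph executive summary for a statement of work "
          f"totaling ${pricing['total']:,.2f} after a "
          f"${pricing['discount']:,.2f} volume discount.")
print(prompt)
```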

Then it’s time to assemble the final document. Some sections come from the legal team and never change. “That’s the boilerplate stuff,” he says. “And with the executive summary, the generative AI can put that in.”

In the end, the electronics company was able to get a solution that significantly cut down the time needed to write the statements of work, he says. But it took a bit of education to get to that point. Without the education, the project would’ve been a great disappointment.

Another thing that companies often don’t understand is that writing a gen AI prompt is not like giving instructions to a fellow adult human, he adds.

“It’s like giving my teenage kids instructions,” Ajgaonkar says. “Sometimes you have to repeat yourself so it sticks. Sometimes, the AI listens, and other times it won’t follow instructions. It’s almost like a different language. When you’re operationalizing something, understanding these minor things is a huge part of the success of the project.”

There are ways to improve the quality of responses, too, such as tree of thought reasoning and similar prompting methods, but these require multiple prompts to refine the response.

“Those are okay when you’re just doing research,” he says. “But when you’re actually running in production, you’re thinking about costs. Every word you push in is counted against your quota. How many tokens you consume will determine the cost of the platform.” Plus, there’s the time it takes to answer each question.

“For every request, if you have to use the tree of thought approach and ask for explanations, that will get very expensive,” he says. “If I was given a blank check, I would run the same prompt a thousand times in different variations to get exactly the result I want. But is it needed for the value it’s adding? That’s the balance that has to be struck when you’re building the solution.”
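The trade-off he describes is easy to quantify. Assuming a made-up per-token price, a tree-of-thought exchange that fans out into a dozen prompts costs a large multiple of a single direct call, and that multiple compounds with request volume:

```python
# Why multi-prompt strategies get expensive: cost scales with prompt count.
# The per-token price is a made-up placeholder.

PRICE_PER_TOKEN = 0.000005  # USD, hypothetical

def request_cost(prompts, tokens_in_each, tokens_out_each):
    return prompts * (tokens_in_each + tokens_out_each) * PRICE_PER_TOKEN

single = request_cost(1, 600, 300)   # one direct prompt
tree = request_cost(12, 900, 400)    # branches plus "explain" follow-ups

print(f"single prompt:   ${single:.4f} per request")
print(f"tree of thought: ${tree:.4f} per request")
print(f"at 100,000 requests/day the gap is ${(tree - single) * 100_000:,.2f}/day")
```

Fractions of a cent either way per request, but across a production workload the multi-prompt version comes out thousands of dollars a day more, which is exactly the balance Ajgaonkar says has to be struck.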


Data issues

Carm Taglienti, distinguished engineer at Insight, also recently ran into a case where unrealistic expectations nearly sank an AI project.

“AI project failure is 99% about expectations,” he says. “It’s not about the failure of the technology but the expectation of what people believe the technology can do.”

In this particular case, a client, a large US-based chip fabrication company, wanted to use AI to fix its supply-chain management issues. Not only did the company expect the AI to do things it couldn’t, it also expected things to work on the first try. But each time a project moves from one phase to another, there’s a good chance the first approach won’t work and adjustments will need to be made. Each of those points is an opportunity for a company to give up on an AI project, he says. But in this particular case there was also a technical issue: a lack of good data.

In the past, when a particular chip or component wasn’t available, the company used a labor-intensive, manual process to find a replacement.

“But this wasn’t agile enough for them to support their business,” he says. Some of this process could be replaced by decision trees and expert systems, but these were fragile. If anything changed in the industry, the entire decision tree would need to be updated. Using AI, however, required a large amount of clean data. But the kind of exhaustive searches for components that would make for training data were rare.

“You don’t do a competitive analysis every time,” says Taglienti, and the chip manufacturer stuck with a preferred list of suppliers and top backups, only rarely doing large-scale supplier reviews.

The other problem was that, even when the data was available, it was in a difficult-to-process form. “If you’re a manufacturer, you create specifications,” he says. “But it wasn’t in a format you could ingest quickly.”

Then there are the more nuanced issues, like where the manufacturer had its facilities, and its reputation for timely deliveries.

“I have to do things like scrape the web and look at their 10-K if they’re a publicly traded company,” says Taglienti. “There’s a lot more to it than just saying I found a part that works.” This kind of analysis was possible to automate even before gen AI came along, he says, but it’s a much more complex process than people might assume at the start.

And this isn’t unusual: the lack of usable data has long been a problem for AI and ML projects. In the AIIA survey, data issues were a significant challenge for 84% of companies deploying gen AI.

PwC’s Greenstein, for instance, recently worked with a consumer company that wanted to launch a project to automate back-office processing.

“They had their AI services set up,” he says. “Their cloud was set up. Their people were ready. But they didn’t anticipate how hard it was to get access to the data.” One data source required API licenses the company didn’t have, so it would need to go through a procurement process to get them, which can take months.

“In another system, the access controls were at a very high level by organization,” he says. “A third system was user-based controls. For gen AI, they had to reconcile all those, but they couldn’t do it quickly.”

In the long term, the company would get all the data it needed, he says, but it would have lost months.

“In this case, they pivoted to other use cases,” Greenstein says. “But leadership lost time and enthusiasm. All the people who were excited about the potential productivity improvements were frustrated, as well as the IT teams who hadn’t considered the data stuff, leading to leadership losing confidence in them.”

He says companies should prioritize potential AI use cases first by impact, second by risk, and third by data: “Do we have the data to do this use case? Do we have permission to use it? Is it accessible? Is it clean enough to be useful?” he asks. “If we don’t get past this step, we don’t start. We find another use case.”
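That gate can even be written down as a simple screen. The scoring below is an invented illustration, but the boolean checks mirror Greenstein’s four data questions:

```python
# Greenstein's gate as a checklist. The dataclass fields mirror his
# questions; the example use cases and the ranking rule are invented.
from dataclasses import dataclass

@dataclass
class UseCase:
    name: str
    impact: int            # 1 (low) to 5 (high)
    risk: int              # 1 (low) to 5 (high)
    have_data: bool        # do we have the data for this use case?
    have_permission: bool  # do we have permission to use it?
    accessible: bool       # is it accessible?
    clean_enough: bool     # is it clean enough to be useful?

def passes_data_gate(u: UseCase) -> bool:
    # "If we don't get past this step, we don't start."
    return u.have_data and u.have_permission and u.accessible and u.clean_enough

candidates = [
    UseCase("back-office automation", 5, 2, True, False, True, True),
    UseCase("contract summarization", 4, 2, True, True, True, True),
]

# Rank by impact first, then by lower risk, considering only use cases
# that clear the data gate; the rest wait for another use case.
viable = sorted((u for u in candidates if passes_data_gate(u)),
                key=lambda u: (-u.impact, u.risk))
print([u.name for u in viable])  # back-office automation fails on permission
```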
