Liu Qiangdong Plays His Trump Card: JD.com's Large Model Takes Off

Li Songyue 2026-06-05 13:52

Quick Summary

This article covers JD.com’s official launch and open-source release of the JoyAI-Echo long audio-video generation framework, which solves the long-standing challenge of long-form video generation in the AI industry and pushes AI-generated video from the demonstration stage to production-grade application.

1. The framework delivers targeted optimizations to address three core pain points of AI long video generation: It integrates a cross-modal audio-video memory bank that maintains high consistency of character appearance and voice even across 5 minutes of generated content; technical optimizations to the generation pipeline, including the DMD module alone, boost inference speed by approximately 7.5x, greatly improving generation efficiency; and it adds conversational editing with a "director assistant" mechanism that supports local content modification, eliminating the need to regenerate the entire video and reducing revision costs.

2. AI long video generation still faces limitations including high computing costs, unstable output for complex plots, and copyright and compliance concerns. Once the technology matures, it will see widespread adoption across e-commerce, advertising, short dramas and other sectors, further lowering the barrier to content creation and enabling ordinary creators to produce professional-grade content.

JD.com’s launch of the JoyAI-Echo long video generation framework, paired with its previously opened AI digital human live streaming capability, creates new opportunities for brand marketing and content production, and clarifies emerging consumer trends.

1. E-commerce has entered an era of content-driven consumption, where short videos and live streams have become the core entry point for consumer purchasing decisions. AI content production capability has become a new core competitive advantage for brand marketing, helping brands drastically lower content creation barriers, produce more product-placement and live stream content, and reach a larger audience.

2. JD.com has opened its AI digital human live streaming capability to all merchants for free, and supports brands in building 24/7 live streaming rooms with matching public domain traffic, helping brands cut costs, improve efficiency and boost conversion. After JoyAI-Echo is fully deployed, it will further enable brands to produce more long-form marketing content.

3. It is important to note that AI-generated content still faces unresolved issues around copyright ownership and compliance of training data sources. Brands should proactively mitigate risks and prioritize compliance when building out AI-powered content marketing strategies.

The accelerating maturation of AI long video generation technology brings new growth opportunities for e-commerce sellers, while also highlighting key risks that make early strategic attention worthwhile.

1. On the opportunity side: JD.com has opened its AI digital human live streaming capability to all sellers for free, with matching public domain traffic support. Sellers can build low-cost 24/7 live streaming rooms that drive continuous customer acquisition and conversion without full-time on-site staff, drastically cutting live stream operation costs. Once JoyAI-Echo is integrated into merchant backend systems, sellers will also be able to produce professional long-form marketing content at low cost to meet product discovery needs, solving the pain point that small and medium-sized sellers often lack access to professional content teams.

2. On the risk side: At this stage, AI long video generation still faces challenges including high computing costs, unstable output for complex content, and unclear rules around copyright and data compliance. Sellers should control trial-and-error costs in early deployments, and prioritize the official, compliant AI tools provided by platforms to avoid unnecessary risks.

3. Content-driven e-commerce is now the industry mainstream, and AI content tools will become a new core competitive advantage for sellers. Early trial and operational experience building is recommended.

Breakthroughs in AI content production technology bring new insights and business opportunities for factories pursuing digital transformation and expanding e-commerce channels.

1. On the business opportunity side: A growing number of factories are launching direct-to-consumer private brands on e-commerce platforms, which requires large volumes of marketing content to attract consumers. However, most factories lack in-house professional content teams, leading to high content production costs. JD.com’s AI content tools drastically lower the barrier to content creation, allowing factories to produce product discovery videos and run 24/7 live streaming rooms at low cost, quickly build out a content marketing system for their private brands, and boost product sales.

2. For digital transformation strategy: Leading e-commerce platforms are building AI content tools into open platform infrastructure available to all merchants. When pursuing digital and e-commerce transformation, factories can leverage existing platform AI capabilities to fill gaps in content marketing and live stream operation, without investing heavily in in-house tool development. This allows them to access the same content capabilities as large merchants and improve their retail competitiveness.

3. The industrialization of AI content is accelerating. Factories can proactively connect to platform AI tools to test and refine content operation models that fit their business, and build competitive advantages early.

The AI long video generation industry is entering a phase of accelerated development, with clear industry trends and widespread customer pain points that point to key development directions for AI content service providers.

1. Industry trends: AI video has become a core strategic focus for global internet giants. The industry has shifted from prioritizing basic generation capability to pursuing industrial-scale production capability, with core requirements tailored to actual commercial production needs. Competition is now centered on three key dimensions: long-video content consistency, generation efficiency, and editability.

2. Core customer pain points: Most B-end clients currently find that existing AI video tools can only produce short demonstration-grade content that cannot meet production needs, and revision costs are extremely high. Small and medium-sized clients have limited content production budgets, so demand for cost reduction is particularly strong.

3. Recommended solution directions: Providers can follow JD.com’s technical approach: using a cross-modal memory bank to resolve temporal content consistency, optimizing the generation pipeline to boost inference speed, and adding natural-language conversational editing to support local modifications. Service providers can also integrate with the ecosystems of major platforms to deliver end-to-end AI content production solutions that align with customers’ actual production needs.

AI content production capability has become a new core competitive advantage for e-commerce and content platforms. Industry players have already developed mature practices, while also identifying key risks to avoid, offering valuable lessons for platform operators.

1. Core merchant demands: Merchants on most platforms universally want to cut costs for content production and live stream operation, and expect platforms to provide low-cost, easy-to-use AI content tools that help them produce more content and improve conversion. This has also become a new key selling point for platforms to attract merchant onboarding.

2. Latest industry best practices: JD.com has taken a merchant enablement focused approach: it offers free AI digital human live streaming to merchants, provides matching public domain traffic to support conversion, and is gradually improving the full AI content production pipeline to build AI tools into core platform infrastructure. ByteDance focuses on building out the full production pipeline for AI-generated short dramas; Alibaba prioritizes building a complete industrial process for film and television creation; Kuaishou focuses on deepening capabilities for short-form video and live streaming. All players have built differentiated strategies aligned with their own ecosystems.

3. Key risks to avoid: At this stage, AI long video generation still faces issues including high computing costs, unstable content output, unclear copyright ownership, and content authenticity risks. Platforms should proactively establish clear rules to control compliance risks and avoid unnecessary disputes in the future.

The AI long video generation field has seen notable new industry developments, many unresolved open questions, and differentiated business models across platforms that all offer valuable research opportunities.

1. New industry developments: AI long video generation has become a highly contested strategic track for global internet giants. The industry as a whole is transitioning from demonstration-grade capability to industrial-scale production-grade capability. Competition has shifted from single-dimensional video generation capability to full-pipeline industrial production capability, centered on solving practical challenges including character consistency, long-sequence logic, interactive editing, and commercial generation efficiency. The industrialization of AI content is entering an entirely new stage.

2. Open research questions: The industry still faces many unresolved challenges, including technical issues such as high computing costs, unstable output for complex plots, and limited fine control over content details, as well as regulatory and compliance issues around training data sourcing, copyright ownership, and content authenticity. As the scale of AI-generated content expands, these issues will become increasingly prominent, creating demand for in-depth research into targeted solutions.

3. Business model research directions: Major platforms have developed distinct business model paths aligned with their own ecosystems: JD.com focuses on enabling e-commerce merchants to cut costs and improve efficiency; ByteDance builds out AI short drama production aligned with its existing content ecosystem; Alibaba is experimenting with building a complete industrial process for film and television production. All these differentiated paths are worthy of in-depth research to identify replicable best practices for different application scenarios.

Disclaimer: The "Quick Summary" content is entirely generated by AI. Please exercise discretion when interpreting the information. For issues or corrections, please email run@ebrun.com .


Liu Qiangdong: Technological progress over the next five years may surpass the achievements of the past decade.

Produced by 电商头条 | Author: Li Songyue

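The AI wave is sweeping in and new models keep appearing, but video generation has remained a tough nut to crack. Long videos in particular almost never come out right in a single pass: either a character's movements go wrong or the scene logic falls apart.

This has kept AI video stuck at the "toy" stage for a long time, unable to truly enter professional content creation.

Fortunately, the technology keeps advancing, and the major internet companies continue to work on the problem.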

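Recently, JD.com officially launched and open-sourced JoyAI-Echo, a long-form audio-video generation framework. Compared with the many AI video models in the industry that remain at the "few-second clip" stage, JoyAI-Echo's core breakthrough is that it genuinely begins to tackle long video generation, a widely recognized hard problem.

AI-generated long video has long faced three key problems: character consistency tends to break down, characters' voices change frequently, and generation is too slow to meet real production needs. JoyAI-Echo was systematically optimized around exactly these three problems.

JoyAI-Echo has a built-in cross-modal audio-video memory bank that continuously records and retrieves a character's appearance features and voice timbre during multi-shot generation, keeping the character consistent across long durations and multiple scene changes.

Testing has verified that over a generation run as long as 5 minutes, character identity, visual appearance, and voice timbre all remain highly consistent.

What this is really addressing is the thorniest problem in AI video: temporal consistency.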
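To make the memory-bank idea concrete, here is a minimal toy sketch in Python. It is not JoyAI-Echo's implementation: the names `CharacterMemory`, `MemoryBank`, and `generate_shot`, and the short numeric vectors standing in for appearance and voice features, are all assumptions made for illustration. The sketch only shows the general pattern of storing a character's features once and reusing them for every later shot.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterMemory:
    """Hypothetical per-character record: one appearance vector, one voice vector."""
    appearance: list
    voice: list


@dataclass
class MemoryBank:
    """Toy cross-modal memory: maps a character name to its stored features."""
    entries: dict = field(default_factory=dict)

    def remember(self, name, appearance, voice):
        # Keep the features from the first appearance only, so later shots
        # cannot overwrite (and drift away from) the original identity.
        self.entries.setdefault(name, CharacterMemory(appearance, voice))

    def recall(self, name):
        return self.entries[name]


def generate_shot(bank, name, scene):
    """Stand-in for a generator that conditions each shot on recalled features."""
    memory = bank.recall(name)
    return {"scene": scene, "appearance": memory.appearance, "voice": memory.voice}


bank = MemoryBank()
bank.remember("host", appearance=[0.1, 0.9], voice=[0.4, 0.2])
shots = [generate_shot(bank, "host", scene) for scene in ("studio", "street", "warehouse")]
```

Every shot in `shots` carries the same appearance and voice features regardless of scene, which is the property the article describes as consistency across long, multi-scene videos. A real system would store learned embeddings and feed them into the generation model rather than copying them into a dictionary.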

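Previously, most AI video models performed acceptably on short clips, but once the duration grew, problems appeared: changing faces, mismatched clothing, drifting voices, and even breaks in scene logic.

That is why AI video has so far been used mostly for concept demos and experimental shorts, and has struggled to enter industrial-scale content production.

The significance of the JoyAI-Echo release is that it starts moving AI video from "demo-grade" toward "production-grade." It also marks JD.com's entry into the global first tier in long video generation.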

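Beyond character and voice consistency, JoyAI-Echo's other major breakthrough is generation efficiency.

The JD.com team proposed a "memory-driven post-training pipeline" that combines SFT, cross-modal RLHF, and Distribution Matching Distillation (DMD), among other techniques, to optimize the generation pipeline. DMD alone delivered roughly a 7.5x improvement in inference speed.

Higher inference efficiency means AI video is gaining stronger real-time production capability, and it also means the barrier to commercialization is falling.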

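JoyAI-Echo also adds a notable feature: "conversational editing."

Another pain point of AI video generation has been the very high cost of revisions. A user unhappy with a single shot often had to regenerate the entire video.

JoyAI-Echo introduces a Director Agent mechanism that lets users adjust shots, scenes, and character content directly in natural language, making local edits without rerunning the whole video.

This means AI video is gradually evolving from a "static generation tool" into a "dynamic collaboration tool."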
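The cost difference between a local edit and a full rerun can be illustrated with a small Python sketch. This is a hypothetical model, not the Director Agent itself: `render`, `Timeline`, and `edit_shot` are invented names, and the step of interpreting a natural-language instruction is left out, so the caller passes the shot index and the new prompt directly.

```python
def render(prompt):
    """Stand-in for an expensive per-shot generation call."""
    return f"<clip: {prompt}>"


class Timeline:
    """Toy video made of independently rendered shots."""

    def __init__(self, prompts):
        self.prompts = list(prompts)
        self.clips = [render(p) for p in self.prompts]
        self.render_calls = len(self.prompts)  # one call per shot for the first pass

    def edit_shot(self, index, new_prompt):
        # Local modification: only the targeted shot is re-rendered,
        # every other clip is reused as-is.
        self.prompts[index] = new_prompt
        self.clips[index] = render(new_prompt)
        self.render_calls += 1


timeline = Timeline(["wide shot of store", "close-up of product", "host waves goodbye"])
timeline.edit_shot(1, "close-up of product, warmer lighting")
```

After the edit, `timeline.render_calls` is 4 (three initial shots plus one revision), whereas regenerating the whole three-shot video would have brought the total to 6. The gap widens as the number of shots grows, which is why shot-level editing matters more for long videos than for short clips.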

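At the industry level, the JoyAI-Echo release also has very practical significance for JD.com's own ecosystem.

E-commerce has entered a "content-driven consumption" phase. Short videos, live streams, and product recommendation content are becoming important entry points for consumer purchase decisions. Driven in particular by platforms such as Douyin and Kuaishou, "short video + live streaming" has become the mainstream industry trend.

Once AI long video generation matures, the first thing it will change is the logic of e-commerce content production.

For JD.com, this is not just a technical breakthrough but an upgrade in platform capability.

That is because JD.com already has a vast ecosystem of merchants and products. If AI video tools can be deeply embedded in the merchant backend, they effectively become a new kind of infrastructure.

For the many small and medium-sized merchants, AI-generated content means a lower marketing barrier; for the platform, it means a large increase in content supply.

Especially with live-stream e-commerce and content e-commerce competing fiercely, AI video capability may gradually become one of a platform's key competitive strengths.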

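In fact, JD.com has been investing in AI for e-commerce for years.

Besides JoyAI-Echo, JD.com has previously released a series of AI products, including the JoyAI foundation model, the JoyAI-RA embodied intelligence model, JoyInside, AI digital humans, and the AI agent "京言."

Among these, AI digital humans and intelligent customer service entered real business scenarios relatively early.

As early as 2024, JD.com launched the "采销东哥" AI digital human, modeled on group founder Liu Qiangdong, for its live-streaming debut.

Within just 30 minutes of going live, views of the live stream exceeded 10 million. Over the 40-minute broadcast, total orders passed 100,000, total transaction value exceeded 50 million yuan, and average viewer dwell time reached 5.6 times the daily average.

Last December, JD.com officially announced that its digital human live streaming would be free for all merchants, aiming to help them quickly set up uninterrupted, round-the-clock live-streaming rooms.

At the same time, JD.com fully opened up public-domain traffic to help merchants cut costs, raise efficiency, and convert more effectively.

And just two months ago, Liu Qiangdong's digital human appeared again, this time at an event in the Sanya international yacht sub-exhibition area, where it delivered remarks and announced that Liu Qiangdong's personal yacht brand was being established in Sanya.

The launch of JoyAI-Echo now signals that JD.com is extending further into the AI content production pipeline.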

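Across the industry as a whole, the long video generation race is also heating up quickly.

In early 2026, ByteDance launched the Seedance 2.0 video generation model, which many industry insiders regard as one of the key milestones in AI video's move from "usable" to "production-grade."

Then in May 2026, Volcano Engine (火山引擎) officially launched "火山剧创1.0," which covers the full short-drama creation workflow, including script generation, shot breakdown, and video generation.

What this reflects is ByteDance's clear commitment to "AI content industrialization."

ByteDance has a huge content ecosystem, including Douyin and Xigua Video, and short dramas are currently among the fastest-growing content formats by traffic.

If AI video can lower the cost of producing short dramas, it will directly affect the efficiency of future content supply.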

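Alibaba is likewise moving quickly on video generation.

Earlier, HappyHorse anonymously topped an image-to-video leaderboard. Alibaba then officially "claimed" HappyHorse, confirming that it was developed by its ATH (Alibaba Token Hub) innovation business unit.

In May 2026, Alibaba Cloud launched the AI video creation platform "万镜一刻," which integrates the capabilities of multiple models, including HappyHorse, Wan, Qwen-image, and Z-image.

Notably, Alibaba's approach leans toward a "complete creation pipeline."

Its platform offers not only video generation but also functional modules such as a "screenwriter Agent," a "director Agent," and a "prompt Agent."

For example, the screenwriter Agent can expand a one-sentence idea into a full script, the director Agent breaks it down into shots, and the prompt Agent generates cinematic camera-movement language.

This model is essentially an attempt to build out the entire film and television production process.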

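Kuaishou, for its part, officially launched the Kling (可灵) 3.0 model series globally in February 2026, including Kling Video 3.0 and Kling Video 3.0 Omni.

Because Kuaishou has long been deeply rooted in the short video and live-streaming ecosystem, its AI video capability is also seen as one of the important directions for the platform's future commercialization.

Clearly, AI video generation is becoming a must-win battleground for several internet giants. The focus of competition is also shifting from "generation capability" alone to a broader "industrialization capability."

What matters more is the ability to handle character consistency, long-range temporal logic, and interactive editing, and whether a system can sustain the high efficiency that commercial production demands.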

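Of course, AI long video still has many limitations at this stage.

High compute costs, insufficient stability on complex plots, limited control over details, and copyright and data compliance issues all remain challenges the industry must keep facing.

In particular, as the volume of AI-generated content grows, questions about training data sources, copyright ownership, and content authenticity will draw increasing attention.

Overall, though, the AI video industry has clearly entered a phase of acceleration.

Once long video capability matures, its impact will not be limited to entertainment.

Advertising, e-commerce, education, gaming, short dramas, brand marketing, virtual hosts, digital human live streaming, and even future interactive film and television could all change as a result.

For platforms, AI video means lower content production costs, more efficient content supply, and stronger commercial conversion.

For creators, it means the barrier to content production falls further.

And for the internet industry as a whole, it means "content industrialization" is entering an entirely new stage.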

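Note: By Li Songyue. Source: 电商头条 (WeChat official account ID: ecxinwen). This article reflects the author's independent views and does not represent the position of Ebrun (亿邦动力).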
