Advancing Business Horizons

Who is the target market for the DouBao AI smartphone?

On December 1, the DouBao team at ByteDance released a technical preview version of DouBao Phone Assistant.

According to reports, DouBao Phone Assistant is an AI assistant software developed in collaboration with mobile device manufacturers at the operating system level, based on the DouBao app. Leveraging the capabilities of DouBao’s large model and authorization from mobile manufacturers, DouBao Phone Assistant provides users with more convenient interactions and richer experiences.

At this stage, developers and tech enthusiasts can try the technical preview of DouBao Phone Assistant on the nubia M153, an engineering prototype co-developed by DouBao and ZTE. The device is being sold in limited quantities at 3,499 yuan.

The emergence of DouBao Phone Assistant represents an attempt to use AI Agents to bridge apps and reconstruct the interaction logic of mobile internet.

Although the current demonstrations still require disclaimers regarding technological ‘uncertainty,’ this deep integration into the operating system’s lower levels, aiming for ‘intent-driven services,’ may carry more revolutionary significance than standalone chatbots.

Perhaps whoever can first solve the stability challenges of ‘operating smartphones’ will define the ‘iPhone moment’ of the AI era.

Previously, according to a former hardware product manager at ZTE who spoke to ‘GeekPark,’ ByteDance and Nubia prepared an initial stock of 500,000 units for the phone’s first sale and ordered corresponding quantities of key smartphone components.

In the current smartphone market, mainstream flagship models from domestic brands typically launch with an initial stock of 2 to 3 million units. Although DouBao Phone’s 500,000 units cannot match the volumes of leading smartphone manufacturers, whose annual shipments exceed ten million units, its goal of moving beyond a ‘geek toy’ to reach a broader user base is already clear.

A launch stock of 500,000 units, if sold in full, is still large enough to make an impression on the industry. By comparison, Black Shark, once a top player in the niche gaming smartphone market, shipped 1-1.5 million units during 2022-2023.

1 From ‘Dialogue Box’ to ‘Action-Oriented’

Over the past two years, we have grown used to chatbots that can write poems and draw pictures. For ordinary users, however, the biggest pain point on a phone is often the cumbersome sequence of steps needed to get something done. The highlight of DouBao Phone Assistant is its attempt to move from ‘conversation’ to ‘action’.

In the technical preview demonstration, DouBao showcased a capability often discussed in earlier GUI Agent (Graphical User Interface Agent) research: it can ‘understand’ the screen as a human would and directly simulate taps.

This confidence in ‘understanding the screen’ and simulating human operations rests on the multimodal capabilities that DouBao’s large model has built up.

According to official disclosures, the model ranks among the top tier internationally in visual understanding, reasoning, and image creation. Its accurate Graphical User Interface (GUI) recognition has earned it high scores in multiple authoritative evaluations and lets it understand what ‘buttons’ and ‘input fields’ mean, as a human would, rather than merely parsing a pile of code.

According to the official user documentation of DouBao Phone, DouBao will automatically determine whether to invoke AI Agent capabilities based on user intent. If the user’s message begins with ‘Help me operate the phone,’ the task will be completed entirely through AI phone operation.

The more detailed the task description, the more efficiently it is executed and the better the result. For example: ‘Open Meituan Takeout and help me write positive reviews for my recent orders.’ In addition, AI phone operation runs on a virtual screen that does not open in the foreground by default and does not interfere with other ongoing tasks; you can return to the home screen and use other applications at any time.

Users can also simply tell DouBao what they need in conversation, and DouBao will decide from the request whether to complete the task by operating the phone. Alternatively, users can tap the ‘Operate Phone’ button at the bottom of the DouBao chat box to describe the request manually or to set conditions such as scheduled tasks.
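The routing behavior described in the documentation can be pictured with a small sketch. This is an illustration only, not DouBao's implementation: the names `Route`, `route_request`, and `ACTION_HINTS` are hypothetical, and the keyword check is a crude stand-in for whatever model-based intent detection the product actually uses. Only the trigger phrase ‘Help me operate the phone’ and the manual ‘Operate Phone’ button come from the article.

```python
from enum import Enum

# Trigger phrase quoted in the user documentation (compared case-insensitively).
TRIGGER_PREFIX = "help me operate the phone"

# Hypothetical keyword list standing in for model-based intent detection.
ACTION_HINTS = ("open ", "book ", "order ", "compare prices")


class Route(Enum):
    CHAT = "chat"                # answer in the conversation only
    PHONE_AGENT = "phone_agent"  # hand the task to AI phone operation


def route_request(text: str, manual_button: bool = False) -> Route:
    """Decide whether a request is answered in chat or by operating the phone."""
    normalized = text.strip().lower()
    # Explicit entry points: the 'Operate Phone' button or the trigger phrase.
    if manual_button or normalized.startswith(TRIGGER_PREFIX):
        return Route.PHONE_AGENT
    # Assumed automatic path: a naive substring check, for illustration only.
    if any(hint in normalized for hint in ACTION_HINTS):
        return Route.PHONE_AGENT
    return Route.CHAT
```

In this sketch the explicit entry points are checked first, so a request that starts with the trigger phrase never depends on the heuristic guess.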

Imagine this scenario: You are interested in a product on social media. In the past, you would need to take a screenshot, exit the app, open the e-commerce platform, search, and compare prices.

However, in DouBao’s demonstration, you only need to say ‘Help me compare prices across all platforms and place an order,’ and the AI will automatically jump between apps, search for the same product, compare prices and specifications, claim coupons, and even select the lowest-priced item for you and add it to the shopping cart.

Although, for security reasons, the payment process still requires manual confirmation, the series of mechanical clicks and switches beforehand have already been handled by the AI.
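The compare-then-confirm flow can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the product's logic: `Offer`, `pick_cheapest`, and `checkout` are hypothetical names, and the rule that a coupon is simply subtracted from the listed price is an assumption made for illustration. The one behavior taken from the article is that payment does not proceed without manual confirmation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Offer:
    platform: str        # hypothetical platform label
    price: float         # listed price in yuan
    coupon: float = 0.0  # assumed flat discount in yuan

    @property
    def final_price(self) -> float:
        # Assumption: the coupon is subtracted directly and cannot go below zero.
        return max(self.price - self.coupon, 0.0)


def pick_cheapest(offers: List[Offer]) -> Offer:
    """Return the offer with the lowest price after coupons."""
    if not offers:
        raise ValueError("no offers to compare")
    return min(offers, key=lambda offer: offer.final_price)


def checkout(offer: Offer, user_confirmed: bool) -> str:
    """Stop at the cart unless the user has confirmed payment manually."""
    if not user_confirmed:
        return f"Added to cart on {offer.platform}; waiting for manual payment confirmation"
    return f"Paid {offer.final_price:.2f} yuan on {offer.platform}"
```

Keeping the confirmation flag as an explicit argument makes the security boundary visible: everything before `checkout` can be automated, while the final step cannot.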

Even complex tasks can be executed. In the official demonstration of a travel planning scenario, when a user issues a multi-intent instruction such as ‘I’m going to Paris next month; help me mark the restaurants I’ve saved on the map, check which days have exhibitions, and book tickets,’ the AI can quickly break down the request into six subtasks: from querying social media bookmarks, marking locations on AutoNavi Maps, booking tickets on Ctrip, to finally organizing everything into a memo.

This cross-application, multi-step ‘task chain’ execution capability is one of the key milestones in AI’s transition from a ‘toy’ to a ‘tool.’
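A ‘task chain’ of this kind can be pictured as an ordered list of steps that pass a shared context along. The sketch below is hypothetical: `run_chain` and the four placeholder steps are invented for illustration, they return dummy data rather than touching any real app, and they cover only the four subtasks the article names, not all six from the demonstration.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, object]
Step = Tuple[str, Callable[[Context], Context]]


def run_chain(steps: List[Step], context: Context) -> Tuple[Context, List[str]]:
    """Run steps in order, passing the context along; stop at the first failure."""
    log: List[str] = []
    for name, action in steps:
        try:
            context = action(context)
        except Exception as exc:  # a failed step ends the chain
            log.append(f"{name}: failed ({exc})")
            break
        log.append(f"{name}: ok")
    return context, log


# Placeholder steps with dummy data; none of them calls a real application.
def query_bookmarks(ctx: Context) -> Context:
    return {**ctx, "restaurants": ["placeholder restaurant"]}


def mark_on_map(ctx: Context) -> Context:
    return {**ctx, "marked": list(ctx["restaurants"])}


def book_tickets(ctx: Context) -> Context:
    return {**ctx, "tickets": "pending manual payment"}


def write_memo(ctx: Context) -> Context:
    memo = f"{len(ctx['marked'])} place(s) marked; tickets {ctx['tickets']}"
    return {**ctx, "memo": memo}


STEPS: List[Step] = [
    ("query bookmarks", query_bookmarks),
    ("mark on map", mark_on_map),
    ("book tickets", book_tickets),
    ("write memo", write_memo),
]
```

Calling `run_chain(STEPS, {"destination": "Paris"})` returns the final context and a per-step log, which is where a real agent would surface the point at which a multi-step task broke down.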

To achieve this kind of ‘human-like’ interaction, DouBao has integrated multiple system-level permissions.

At the system level, DouBao Phone offers various interaction methods for AI capabilities, allowing users to activate it via a side button, voice commands, or even through headphones. Within the photo gallery, it can directly understand and execute commands like ‘remove passersby from the image.’

In the more advanced ‘Pro Mode,’ it can also invoke system tools and leverage memory functions to complete complex tasks requiring multi-step reasoning, such as ‘recommending gifts and adding them to the shopping cart.’

Of course, entrusting screen control and personal preferences to AI invariably raises concerns about privacy and security. Therefore, the DouBao team emphasizes that this feature can be enabled on demand and commits to strictly protecting data privacy.

As a ‘technical preview version,’ the DouBao team also noted at the end of the video that due to the inherent uncertainties of large language model technology, the ‘seamless’ experience showcased in the demo cannot yet be fully replicated, and the product still falls short of the team’s ultimate expectations.

This reflects the current state of AI Agents: the direction is highly promising, but practical implementation still requires refinement over time.

The ‘Third Path’ Without Hardware Manufacturing

In the wave of AI-powered smartphones, two schools of thought have consistently existed: one represented by Google/Pixel phones, which develop proprietary models and an entire suite of AI software experiences embedded within their own systems; the other consists of pure software vendors attempting to seize entry points through super apps.

DouBao has chosen the third path: not to develop hardware but to focus on building an ecosystem.

At the same time as the preview release, DouBao explicitly stated that it has “no plans to develop its own smartphone.” Their strategy is highly pragmatic – by negotiating with multiple smartphone manufacturers, they aim to integrate DouBao’s large model capabilities into devices from different brands through “operating system-level cooperation.”

This kind of deep integration between “smartphone manufacturers and large model providers” is becoming a new trend in the industry.

As with the partnership between Google Gemini and Samsung, a consensus is forming that each party should concentrate on what it does best.

For smartphone manufacturers, developing a top-tier model capable of advanced reasoning, visual comprehension, and complex task planning from scratch would be extremely costly. Meanwhile, for internet giants like ByteDance, the lack of a hardware platform means AI remains behind the ‘glass wall’ of an app, unable to access users’ most critical data and scenarios.

The current nubia M153 engineering prototype is just the beginning. Priced at 3,499 yuan, this device may primarily serve as an “invitation” to developers and tech enthusiasts, aiming to validate the technical feasibility and user feedback of such cross-industry collaboration.

In the AI era, simply creating an app is no longer sufficient.

The emergence of DouBao’s smartphone assistant may represent a fundamental rethinking of the interaction logic of mobile internet.

As large models become increasingly powerful, merely developing an app is no longer enough in the AI era.

AI agents need to take on more complex tasks, perceive richer contexts, and deliver tangible functionality to provide real-world value. This means they must move beyond the confines of software, integrate deeply with operating system-level permissions, and leverage hardware capabilities.

In the past, ByteDance has always been a powerful “air force” — possessing cutting-edge algorithms and an extensive application ecosystem. However, compared to Google, which owns Android, or Huawei, which has full-scenario terminals, ByteDance has consistently lacked a solid foothold in operating systems and terminal hardware.

In the mobile internet era, this might not have been an issue, but at a time when AI needs to deeply integrate into user scenarios, the lack of hardware carriers could mean losing the ability to perceive these scenarios.

The launch of DouBao Phone Assistant appears to be an exploratory move by ByteDance at this stage.

From Pico to Ola Friend, and now to an assistant integrated deeply within the phone OS layer, ByteDance is cautiously addressing its weakness in “hardware touchpoints.”

This may not represent the industry’s final form in the next two to three years, but one thing is certain: ByteDance has realized that in order to fully leverage AI, taking this crucial step towards combining software and hardware is essential.

Editor/Liam

