5 features Windows 11's AI Copilot needs to truly be useful
AI has become a huge part of Microsoft’s strategy over the last year or so. Sure, that could be said for many companies, as AI seems to be the hottest buzzword right now. But Microsoft recently took the bold step of bringing it to Windows 11 in the form of Windows Copilot, a new AI assistant that’s (unofficially) a successor to Cortana. It’s powered by the GPT-4 large language model used in Bing Chat, with the aim of making an assistant that’s more useful than ever.
While the potential is exciting, the current implementation of Copilot in Windows Insider builds is very basic and not too different from just using Bing Chat on the web. I have a few ideas for what Microsoft can do to make Copilot a truly essential part of Windows 11.
1. Support all system settings and major functions
The big draw with Windows Copilot at launch was that it could interact with your PC in ways that Bing Chat, confined to a browser, really couldn’t. And this is a great starting point. Many power users already like using text-based interfaces and keyboard commands to perform certain tasks, so being able to change a lot of settings from one place has a lot of potential.
However, the current implementation only supports a few features, like being able to change to light or dark mode, turn on do not disturb, or take a screenshot. This really needs to expand to all the Windows settings that can be found in the Settings app (the Control Panel should probably be left behind at this point), or at least much more than what we have right now. It could change the accent color, disable one of the displays in a multi-monitor setup, change playback devices, and so on. It could even start a focus session.
If Windows Copilot could help with all of this, it could make interacting with certain settings a lot faster, especially for people who do that frequently. There’s plenty that can be done, and Microsoft has promised some of it, but it hasn’t delivered yet.
2. Easy app hooks
Another thing I feel will make or break the usefulness of Copilot is its integration with other apps. In the spirit of becoming a centralized AI assistant, Copilot needs to be able to integrate easily with all kinds of third-party apps. We kind of saw this in action when Microsoft demoed Copilot at this year’s Build and used it to play music with Spotify. But we haven’t seen that become functional yet, and it needs to go beyond Microsoft’s usual partners.
Copilot should be able to open apps and start a specific task within them, or play a specific show on Netflix, or anything else that might be useful in these apps.
3. Integration with File Explorer
I know people are probably sick and tired of Microsoft shoehorning unwanted services wherever it can, but I think Copilot integration in File Explorer could make it useful to more than just the users who want to type instructions into Copilot. That’s especially true because typing instructions requires having the Copilot window taking up space on your screen almost permanently.
It could be interesting to have a feature in File Explorer where you can right-click a file and choose an option called “Send to Copilot,” which would then prompt it to ask what you want to do with said file. For example, with an image, you could ask Copilot to remove the background, or you could ask it to transcribe an audio file. These are both capabilities we’ve seen shown off, but that would usually require you to drag and drop the file into the Copilot panel. I think being able to access it directly from File Explorer would be welcome (but the option to turn it off would be nice, too).
4. Voice control
I suspect I’m alone on this one, which is why I’ve pushed it down the list a bit. Voice commands are obviously more popular on phones, and right now, Bing Chat only supports this feature if you’re on your phone. But I think it would be nice to have voice commands supported on laptops and PCs, too. Being able to change all these settings, start playing music, or open an app with your voice could be useful. And the same goes for asking Bing questions. Cortana did support voice commands when Windows 10 was introduced, and Microsoft wanted that to be a big thing, so I’m not sure if it would actually gain much traction, but I can definitely see uses for it.
With a browser, you have to go a bit out of your way to initiate voice interaction, but with a built-in feature, you could have an activation keyword or shortcut, so you can quickly speak your commands. I imagine it wouldn’t be that hard to implement since voice recognition is already used in multiple parts of Windows.
5. Screen reading and OCR
One last thing I’d love to see Copilot be able to do is read the information on the screen and, especially, perform optical character recognition (OCR). Right now, you can ask Copilot to summarize a page, but it has to be a website open in Edge, which severely limits the functionality. While the ideal approach would be to have this work with other browsers natively, it would be interesting if Copilot could basically act as a screen reader (such as Narrator) and use that to extract information from any page in any browser or app to summarize it.
Another cool feature would be OCR, or the ability to recognize text in images, so you could ask Copilot to grab text from an image or a PDF scan and simply copy it or summarize it. This one is actually not that crazy since it has been reported that some kind of OCR capability is planned for Windows 12, so we could well see something along these lines.