Amazon Alexa nears our Patent

Wes Boudville
5 min read · Sep 23, 2020

Recently Amazon announced key extensions to Alexa that let it run apps when a user, Jane, speaks a command to Alexa. Here, Alexa is running on her mobile phone. The apps have been configured to be run via deep links. Each deep link has the id of the app in a mobile app store, and a second input argument.

However, in all publicly declared cases, Alexa appears to run on the same device (the phone) that has the app store. This greatly simplifies the entire system. When Jane speaks to Alexa, and Alexa finds an answer with a deep link that runs an app, the deep link is handled on the same machine as the app store. The deep link has the id of an app in the app store; that app may need to be installed on the phone if it is not already present, which is the general case.

There has to be some non-trivial coding to act on the deep link held in Alexa's memory: installing the app and then going automatically to a page in the app. (The id of the page is the other part of the deep link.) But at the business level, you can see that this should somehow be possible. Alexa for Apps is the result.
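
To make the two-part structure concrete, here is a minimal sketch of splitting such a deep link into its app store id and its page argument. The URI scheme and parameter names are assumptions for illustration, not Amazon's published format.

```python
# Split a hypothetical deep link into (app store id, in-app page argument).
# The "assistant://" scheme and the "app_id"/"page" parameter names are
# illustrative assumptions, not a documented format.
from urllib.parse import urlparse, parse_qs

def split_deep_link(link: str) -> tuple[str, str]:
    """Return (app_store_id, page_argument) from a deep link."""
    parsed = urlparse(link)
    params = parse_qs(parsed.query)
    app_id = params["app_id"][0]   # id of the app in the mobile app store
    page = params["page"][0]       # second argument: the page inside the app
    return app_id, page

app_id, page = split_deep_link("assistant://open?app_id=com.example.game&page=lobby%2F42")
print(app_id, page)   # com.example.game lobby/42
```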

Kudos to Amazon for enabling this, though Apple and Google are likely not far behind. The harder problem is a different configuration, where Alexa runs on a physical digital assistant device, like the Echo or Echo Spot. Near Alexa is our Jane, holding her phone. She interacts with her phone and also with Alexa. See Figure 1. To make this discussion as broad as possible, Figure 1 uses Ann instead of Alexa, so that a future Siri or Google Assistant might work in this way.

Ann is connected by wired means to a server in the cloud. The server can access an AI engine and large databases (DBs).

Figure 1 — Jane is near digital assistant Ann

With the digital assistant Ann being a physical device, the fundamental issue is the air gap between Ann and Jane's phone. Jane can come arbitrarily close to Ann, whether Ann is a device in Jane's home or in a friend's place that Jane is visiting.

Currently, Jane and Ann interact vocally. Jane speaks a command when she is within earshot of Ann, who finds an answer and speaks it back to Jane. Jane's phone can also emit audio. So one natural step is to have an app on Jane's phone play audio to Ann.

Similarly, Ann can emit audio that encodes a deep link. The audio is meant for Jane's phone, not Jane. Her phone runs decoding software that unpacks the deep link and runs it. This is how a deep link found by Ann gets to Jane's phone. There are known ways for Ann to encode the link into audio, and the result can be made euphonic, like a birdsong chirp. This can involve Ann emitting high frequency sounds, in contrast to her text-to-speech (TTS) output, since speech occupies lower frequency ranges. But most Anns can play music; that is a selling point of the assistants.
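
As a toy illustration of data-over-audio, the sketch below encodes a short URL into near-ultrasonic tones with binary frequency-shift keying, one tone per bit. Real schemes add synchronisation and error correction, and the frequencies, bit rate, and the short link itself are assumptions made only for this example.

```python
# Toy sketch: encode a short URL's bytes as FSK tones above the speech band.
# Frequencies, bit duration, and the link are illustrative assumptions.
import numpy as np

SAMPLE_RATE = 44100        # samples per second
BIT_DURATION = 0.02        # 20 ms per bit
FREQ_ZERO = 17000.0        # Hz, tone for a 0 bit (above most speech)
FREQ_ONE = 18000.0         # Hz, tone for a 1 bit

def encode_to_audio(text: str) -> np.ndarray:
    """Return a float waveform encoding the text's bytes as FSK tones."""
    samples_per_bit = int(SAMPLE_RATE * BIT_DURATION)
    t = np.arange(samples_per_bit) / SAMPLE_RATE
    chunks = []
    for byte in text.encode("utf-8"):
        for i in range(8):
            bit = (byte >> (7 - i)) & 1
            freq = FREQ_ONE if bit else FREQ_ZERO
            chunks.append(np.sin(2 * np.pi * freq * t))
    return np.concatenate(chunks)

waveform = encode_to_audio("https://bit.ly/abc123")  # hypothetical short link
print(len(waveform) / SAMPLE_RATE, "seconds of audio")
```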

Another way to cross the air gap is for Ann to have a (small) screen. She can show a barcode, like a QR or Data Matrix code. Jane uses her phone to scan and decode the barcode. For common codes like QR, decoding apps are freely available, and some phones already have decoders built into the operating system. At present, most Anns do not have a suitable screen. But future Anns could.
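
On Ann's side, rendering the link as a QR code is a few lines. The sketch below uses the third-party "qrcode" Python package (an assumption; any QR encoder would do), and the deep link is the same hypothetical one as above.

```python
# Minimal sketch of Ann rendering a deep link as a QR code for her screen.
# Uses the third-party "qrcode" package with its Pillow image backend.
import qrcode

deep_link = "assistant://open?app_id=com.example.game&page=lobby%2F42"  # hypothetical
img = qrcode.make(deep_link)       # build the QR symbol
img.save("ann_screen_qr.png")      # in practice, Ann would draw this on her display
```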

Figure 2 shows the latest Echo Show (Sept 2020). It replaces the cylinder of previous Echoes with a spherical form and most notably adds a swiveling screen. Clearly this screen can show a barcode for a user to scan with her phone. And Apple and Google might likewise make future consumer devices with such screens.

Figure 2 - Amazon Echo Show

The Echo Show also comes with Amazon Sidewalk, a wireless, WiFi-like way to communicate with other devices. The Show also has ZigBee.

Yet another way is for Ann to carry a programmable RFID tag. She writes a deep link into it, or a short URL (such as a Bitly link) that redirects to the deep link. If Jane has an RFID reader, she can scan Ann's tag. Most current RFID tags are read only, but programmable ones are starting to appear, and they should eventually fall in price.

Similarly, Ann could have an NFC transmitter. Jane, with an NFC receiver on her phone, can get the data.
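
For either the tag or the NFC path, the payload would typically be an NDEF URI record. Here is a minimal sketch using the third-party "ndeflib" package (an assumption), with a hypothetical short URL standing in for the deep link.

```python
# Pack a short URL (standing in for the deep link) into an NDEF URI record,
# the standard payload an NFC tag or transmitter carries. Uses "ndeflib".
import ndef

short_url = "https://bit.ly/abc123"             # hypothetical redirect to the deep link
record = ndef.UriRecord(short_url)              # NDEF well-known URI record
payload = b"".join(ndef.message_encoder([record]))
print(len(payload), "bytes to write to the tag")
```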

Thus for a hardware digital assistant, the air gap between it and a nearby phone can be crossed by numerous means.

Figure 2 shows that we should not be limited by current digital assistant devices in imagining what they could do. Apple and Google could come up with radically different forms and functions just as the 2020 Echo Show differed greatly from its predecessor.

Figure 1 also shows Tim with his phone. In general he is at a different location than Jane. Jane might be a gamer who wants to play other gamers. She asks Ann for links (which might be deep links) to gamers. Ann could sort these by proximity to Jane, because closeness means lower latency, which is crucial for twitch (reflex or First Person Shooter) games. Ann sends the deep links to Jane by any of the above methods. Jane picks one (for Tim), which loads the game if it is not already on her phone and then connects directly to Tim's network address. They can now play.
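
The proximity sort itself is simple. Below is a minimal sketch of Ann ordering gamer links by great-circle distance from Jane, on the assumption that closer players mean lower latency. The coordinates and link fields are made up for illustration.

```python
# Sort gamer deep links by distance from Jane (nearest first).
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

jane = (34.05, -118.24)  # Jane's location (hypothetical)
gamers = [
    {"name": "Tim", "link": "game://join?host=tim", "loc": (34.10, -118.30)},
    {"name": "Raj", "link": "game://join?host=raj", "loc": (40.71, -74.00)},
    {"name": "Mei", "link": "game://join?host=mei", "loc": (34.00, -118.20)},
]
gamers.sort(key=lambda g: haversine_km(*jane, *g["loc"]))
print([g["name"] for g in gamers])   # nearest first, e.g. ['Mei', 'Tim', 'Raj']
```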

Or she might want to watch a gamer play computer opponents (esports). Tim might be this person. In this case, many people can watch. The deep link Ann sends has these properties: it refers to a read only version of the game. When that version is installed and run, it connects to the game server Tim is using. Jane's app shows her what Tim sees on his game screen, but she cannot throw his spears or shoot his guns. Read only. This scenario also cuts out Twitch, the pre-eminent game watching platform. The read only game is made by the firm that makes the game. The firm monetises Jane much as Twitch monetises its users: by showing ads and selling subscriptions that omit the ads.
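
The spectating deep link could carry those properties explicitly. Below is a minimal sketch of constructing such a link; every field name and value is an illustrative assumption.

```python
# Build a hypothetical spectate deep link naming the read only client,
# Tim's game server, and his session. All fields are illustrative.
from urllib.parse import urlencode

def make_spectate_link(app_id: str, server: str, session: str) -> str:
    params = urlencode({
        "app_id": app_id,       # the read only (spectator) build in the app store
        "mode": "spectate",     # the app refuses player input in this mode
        "server": server,       # game server Tim is connected to
        "session": session,     # Tim's match, so Jane sees his screen
    })
    return f"assistant://open?{params}"

print(make_spectate_link("com.example.game.viewer", "game-eu-1.example.com", "tim-4711"))
```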

Jane might want to interact with Tim in many other types of two-person apps. Tim could be a tutor and Jane a student. Or she might be interested in a dating app.

Using device Ann to deliver this kind of information to Jane does not mean that Jane cannot get the data solely from her phone. It improves the value of the digital assistant as a physical device by letting it act as a competing channel.

We received a 2020 US patent for this, "Digital assistant interacting with mobile devices."


Wes Boudville

Inventor. 23 granted US patents on AR/VR/Metaverse. Founded linket.info for mobile brands for users. Linket competes against Twitch and YouTube. PhD physics.