A malicious proof-of-concept Amazon Echo Skill shows how attackers can abuse the Alexa virtual assistant to eavesdrop on consumers with smart devices – and automatically transcribe every word said.
Checkmarx researchers told Threatpost that they created a proof-of-concept Alexa Skill that abuses the virtual assistant’s built-in request capabilities. The rogue Skill begins by initiating an Alexa voice-command session that fails to terminate (stop listening) after the command is given. Next, any recorded audio is transcribed (if voices are captured) and a text transcript is sent to a hacker. Checkmarx said it brought its proof-of-concept attack to Amazon’s attention and that on April 10 the company fixed a coding flaw that had allowed the rogue Skill to capture prolonged audio.
“On default, Alexa ends the sessions after each duration… we were able to build in a feature that kept the session going [so Alexa would continue listening]. We also wanted to make sure that the user is not prompted and that Alexa is still listening without re-prompts,” Erez Yalon, manager of Application Security Research at Checkmarx, told Threatpost.
One challenge for researchers was Alexa’s “reprompt” feature. If a skill keeps the session open after sending its response and the user says nothing, Alexa uses a reprompt to ask the user to repeat the request. Checkmarx researchers, however, were able to replace the spoken reprompt with an empty one, so that a new listening cycle starts without alerting the user.
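To make the reprompt mechanics concrete, here is a minimal sketch of the response body a skill returns, assuming the publicly documented Alexa response fields `outputSpeech`, `reprompt` and `shouldEndSession`. The helper name `build_response` is hypothetical and is not taken from the Checkmarx report.

```python
def build_response(speech, reprompt=None, end_session=True):
    """Build an Alexa-style response body as a plain dict (illustrative helper)."""
    response = {
        "outputSpeech": {"type": "PlainText", "text": speech},
        # False keeps the session open, so the device listens again.
        "shouldEndSession": end_session,
    }
    if reprompt is not None:
        # Normally spoken aloud if the user stays silent.
        response["reprompt"] = {
            "outputSpeech": {"type": "PlainText", "text": reprompt}
        }
    return {"version": "1.0", "response": response}


# A typical open session: the user hears a reprompt if they say nothing.
normal = build_response(
    "Which city?", reprompt="Sorry, which city did you want?", end_session=False
)

# The pattern described in the report: session stays open, reprompt is empty,
# so nothing audible tells the user that listening continues.
silent = build_response("", reprompt="", end_session=False)
```

The only difference between the two bodies is the text content, which is why the empty reprompt was reportedly something Amazon later began checking for.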
Finally, researchers had to accurately transcribe the speech received by the skill: “In order to be able to listen and transcribe any arbitrary text, we had to do two tricks. First, we added a new slot-type, which captures any single word, not limited to a closed list of words. Second, in order to capture sentences at almost any length, we had to build a formatted string for each possible length,” according to the report.
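The second trick, one formatted string per sentence length, can be sketched roughly as follows. This assumes each template is a run of single-word slot placeholders in curly braces, as in Alexa sample utterances; the function name and the slot names (`word_a`, `word_b`, ...) are hypothetical, and the report's actual strings may differ.

```python
import string


def utterance_templates(max_words, slot_prefix="word"):
    """Return one template per sentence length, from 1 to max_words words."""
    if not 1 <= max_words <= len(string.ascii_lowercase):
        raise ValueError("max_words must be between 1 and 26 in this sketch")
    templates = []
    for length in range(1, max_words + 1):
        # One placeholder per word, e.g. "{word_a} {word_b}".
        slots = [
            "{" + f"{slot_prefix}_{letter}" + "}"
            for letter in string.ascii_lowercase[:length]
        ]
        templates.append(" ".join(slots))
    return templates


for template in utterance_templates(3):
    print(template)
```

Because every length needs its own template, the number of strings grows with the longest sentence the skill is meant to match, which fits the report's wording of "almost any length".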
One big issue Checkmarx faced is that on Echo devices a shining blue light reveals when Alexa is listening. But, “the whole point of Alexa is that unlike a smartphone or tablet, you do not have to look at it to operate it,” said Yalon. “They are made to be placed in a corner where users simply speak to it without actively looking to its direction. And with Alexa voice services, vendors are embedding Alexa capabilities into their products and those products might not provide visual indication when the session is running.”
Amazon resolved the issue by tweaking several features on April 10, said Checkmarx. According to the researchers, Amazon now applies specific criteria to identify and reject eavesdropping skills during certification, and it detects both empty reprompts and longer-than-usual sessions.
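Amazon has not published how these checks work, so the following is only a rough sketch of what flagging empty reprompts and unusually long sessions could look like. The `SessionSummary` type, the function names and the 60-second threshold are all assumptions made for illustration; only the response field names follow the documented Alexa response format.

```python
from dataclasses import dataclass, field


@dataclass
class SessionSummary:
    """Hypothetical record of one skill session under review."""
    duration_seconds: float
    responses: list = field(default_factory=list)  # Alexa-style response bodies


def has_empty_reprompt(body):
    """True if the session stays open but the reprompt has no text."""
    response = body.get("response", {})
    if response.get("shouldEndSession", True):
        return False  # session ends, so no further listening cycle
    reprompt = response.get("reprompt")
    if reprompt is None:
        return False
    text = reprompt.get("outputSpeech", {}).get("text", "")
    return not text.strip()


def review_flags(session, max_duration_seconds=60.0):
    """Return human-readable reasons a session looks suspicious."""
    flags = []
    if any(has_empty_reprompt(body) for body in session.responses):
        flags.append("empty reprompt while session stays open")
    if session.duration_seconds > max_duration_seconds:  # assumed threshold
        flags.append("longer-than-usual session")
    return flags


suspicious = SessionSummary(
    duration_seconds=300.0,
    responses=[{
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": ""},
            "reprompt": {"outputSpeech": {"type": "PlainText", "text": ""}},
            "shouldEndSession": False,
        },
    }],
)
print(review_flags(suspicious))
```

A real check would also need to handle SSML reprompts, which this sketch ignores.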
According to Checkmarx researcher Yalon, every “skill” needs to go through a certification process and be approved by Amazon before it can be published to the Amazon store.
“Checkmarx did not try to publicly release the malicious skill… If we did, Amazon would need to approve it. We do not know the timeline of Amazon’s certification process, but we have no reason to believe (including after discussions with Amazon) that our malicious skill would not have been approved prior to the recent mitigations,” said Yalon.
Amazon did not respond to a request from Threatpost for further comment.
The incident raises questions about the privacy risks around voice services such as Alexa, as well as other connected devices in the home.
In September, researchers devised a proof of concept that gives potentially harmful instructions to popular voice assistants like Siri, Google, Cortana, and Alexa using ultrasonic frequencies instead of voice commands. And in November, security firm Armis disclosed that Amazon Echo and Google Home devices are vulnerable to attacks through the over-the-air BlueBorne Bluetooth vulnerability.