[Translation] GAZEploit: Remote Keystroke Inference Attacks via Gaze Estimation in VR/MR Devices
Posted: 2024-9-13 09:10
Apple Vision Pro’s Eye Tracking Exposed What People Type
Original article: https://www.wired.com/story/apple-vision-pro-persona-eye-tracking-spy-typing/
The Vision Pro uses 3D avatars on calls and for streaming. These researchers used eye tracking to work out the passwords and PINs people typed with their avatars.
You can tell a lot about someone from their eyes. They can indicate how tired you are, the type of mood you’re in, and potentially provide clues about health problems. But your eyes could also leak more secretive information: your passwords, PINs, and messages you type.
Today, a group of six computer scientists are revealing a new attack against Apple’s Vision Pro mixed reality headset where exposed eye-tracking data allowed them to decipher what people entered on the device’s virtual keyboard. The attack, dubbed GAZEploit and shared exclusively with WIRED, allowed the researchers to successfully reconstruct passwords, PINs, and messages people typed with their eyes.
“Based on the direction of the eye movement, the hacker can determine which key the victim is now typing,” says Hanqiu Wang, one of the leading researchers involved in the work. They identified the correct letters people typed in passwords 77 percent of the time within five guesses and 92 percent of the time in messages.
To be clear, the researchers did not gain access to Apple’s headset to see what they were viewing. Instead, they worked out what people were typing by remotely analyzing the eye movements of a virtual avatar created by the Vision Pro. This avatar can be used in Zoom calls, Teams, Slack, Reddit, Tinder, Twitter, Skype, and FaceTime.
The researchers alerted Apple to the vulnerability in April, and the company issued a patch to stop the potential for data to leak at the end of July. It is the first attack to exploit people’s “gaze” data in this way, the researchers say. The findings underline how people’s biometric data—information and measurements about your body—can expose sensitive information and be used as part of the burgeoning surveillance industry.
Eye Spy
Your eyes are your mouse when using the Vision Pro. When typing, you look at a virtual keyboard that hovers around, and can be moved and resized. When you’re looking at the right letter, tapping two fingers together works as a click.
What you do stays within the headset, but if you want to jump on a quick Zoom, FaceTime some friends, or livestream, you’ll likely end up using a Persona—the sort of ghostly 3D avatar the Vision Pro creates by scanning your face.
“These technologies … can inadvertently expose critical facial biometrics, including eye-tracking data, through video calls where the user’s virtual avatar mirrors their eye movements,” the researchers write in a preprint paper detailing their findings. Wang says the work relies on two biometrics that can be extracted from recordings of a Persona: the eye aspect ratio (EAR) and eye gaze estimation. (As well as Wang, the research was completed by Siqi Dai, Max Panoff, and Shuo Wang from the University of Florida, Haoqi Shan from blockchain security company CertiK, and Zihao Zhan from Texas Tech University.)
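The eye aspect ratio is a standard landmark-based measure of how open an eye is, widely used for blink detection. The article does not describe how GAZEploit extracts it from a Persona recording, so the following is only an illustrative sketch of the usual formulation, with a conventional p1–p6 landmark indexing assumed rather than taken from the paper.

```python
# Minimal sketch: eye aspect ratio (EAR) from six 2D eye landmarks,
# using the standard landmark-based formulation. p1/p4 are the eye
# corners, p2/p3 and p5/p6 the upper/lower lid points. How GAZEploit
# obtains landmarks from a Persona video frame is an assumption here.
import numpy as np

def eye_aspect_ratio(landmarks: np.ndarray) -> float:
    """landmarks: (6, 2) array of eye landmark coordinates p1..p6."""
    p1, p2, p3, p4, p5, p6 = landmarks
    vertical = np.linalg.norm(p2 - p6) + np.linalg.norm(p3 - p5)
    horizontal = np.linalg.norm(p1 - p4)
    return vertical / (2.0 * horizontal)

# Example: an open eye yields a higher EAR than a blink/closed eye.
open_eye = np.array([[0, 0], [2, 1], [4, 1], [6, 0], [4, -1], [2, -1]], float)
print(eye_aspect_ratio(open_eye))  # ~0.33
```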
The GAZEploit attack consists of two parts, says Zhan, one of the lead researchers. First, the researchers created a way to identify when someone wearing the Vision Pro is typing by analyzing the 3D avatar they are sharing. For this, they trained a recurrent neural network, a type of deep learning model, with recordings of 30 people’s avatars while they completed a variety of typing tasks.
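The article does not specify the network's features or architecture, so the sketch below is hypothetical: a small GRU classifier in PyTorch that labels a window of per-frame gaze features (for example gaze direction plus EAR) as typing or not typing.

```python
# Hypothetical sketch of a typing-session detector: a recurrent network
# that classifies a window of per-frame gaze features as "typing" vs.
# "not typing". The real GAZEploit model's inputs and architecture are
# not given in the article; everything below is illustrative.
import torch
import torch.nn as nn

class TypingDetector(nn.Module):
    def __init__(self, feature_dim: int = 3, hidden_dim: int = 64):
        super().__init__()
        self.rnn = nn.GRU(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)  # logit for "typing"

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, feature_dim) sequence of gaze features
        _, h_n = self.rnn(x)          # final hidden state per sequence
        return self.head(h_n[-1])     # (batch, 1) logits

model = TypingDetector()
window = torch.randn(8, 120, 3)       # 8 windows of 120 frames, 3 features each
prob_typing = torch.sigmoid(model(window))
print(prob_typing.shape)              # torch.Size([8, 1])
```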
When someone is typing using the Vision Pro, their gaze fixates on the key they are likely to press, the researchers say, before quickly moving to the next key. “When we are typing our gaze will show some regular patterns,” Zhan says.
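This fixate-then-jump pattern is exactly what standard fixation/saccade segmentation captures. As a rough illustration (not the paper's method), a simple velocity-threshold pass can split a gaze trace into fixations, each a candidate keystroke; the threshold and sampling rate below are arbitrary.

```python
# Illustrative sketch of segmenting a gaze trace into fixations
# (candidate keystrokes) separated by saccades, I-VT style. Threshold
# and frame rate are made-up values, not taken from the GAZEploit paper.
import numpy as np

def segment_fixations(gaze_xy: np.ndarray, fps: float = 30.0,
                      velocity_thresh: float = 0.5) -> list[np.ndarray]:
    """gaze_xy: (T, 2) gaze points; returns one mean point per fixation."""
    velocity = np.linalg.norm(np.diff(gaze_xy, axis=0), axis=1) * fps
    is_fixation = np.concatenate([[True], velocity < velocity_thresh])
    fixations, current = [], []
    for point, fixating in zip(gaze_xy, is_fixation):
        if fixating:
            current.append(point)
        elif current:
            fixations.append(np.mean(current, axis=0))
            current = []
    if current:
        fixations.append(np.mean(current, axis=0))
    return fixations
```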
Wang says these patterns are more common during typing than if someone is browsing a website or watching a video while wearing the headset. “During tasks like gaze typing, the frequency of your eye blinking decreases because you are more focused,” Wang says. In short: Looking at a QWERTY keyboard and moving between the letters is a pretty distinct behavior.
The second part of the research, Zhan explains, uses geometric calculations to work out where someone has positioned the keyboard and the size they’ve made it. “The only requirement is that as long as we get enough gaze information that can accurately recover the keyboard, then all following keystrokes can be detected.”
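As a rough illustration of the idea (the paper's actual geometric reconstruction is not described in the article), one could fit a bounding box to the cloud of typing fixations and express every fixation in keyboard-relative coordinates:

```python
# Sketch only: estimate the keyboard's extent from the fixation cloud
# and map each fixation into keyboard-relative [0, 1] coordinates.
# The real GAZEploit geometry is more involved than this.
import numpy as np

def normalize_to_keyboard(fixations: np.ndarray) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """fixations: (N, 2) gaze fixation points spanning the keyboard area."""
    lo = fixations.min(axis=0)                   # estimated one corner of the key area
    hi = fixations.max(axis=0)                   # estimated opposite corner
    normalized = (fixations - lo) / (hi - lo)    # each point mapped into [0, 1]^2
    return normalized, lo, hi
```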
Combining these two elements, they were able to predict the keys someone was likely to be typing. In a series of lab tests, they didn’t have any knowledge of the victim’s typing habits, speed, or know where the keyboard was placed. However, the researchers could predict the correct letters typed, in a maximum of five guesses, with 92.1 percent accuracy in messages, 77 percent of the time for passwords, 73 percent of the time for PINs, and 86.1 percent of occasions for emails, URLs, and webpages. (On the first guess, the letters would be right between 35 and 59 percent of the time, depending on what kind of information they were trying to work out.) Duplicate letters and typos add extra challenges.
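A "correct within five guesses" metric suggests ranking candidate keys by distance from each fixation. The sketch below does that against a rough, hypothetical QWERTY layout in normalized coordinates; the actual layout and ranking model used in the paper are not given in the article.

```python
# Sketch: turn a normalized fixation into ranked key guesses by sorting
# keys by distance from the fixation and keeping the top k. The key
# centers below are a crude QWERTY approximation, not the paper's layout.
import numpy as np

ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEY_CENTERS = {
    ch: np.array([(col + 0.5 + 0.3 * row) / 10.0, 0.15 + row / 3.0])
    for row, keys in enumerate(ROWS)
    for col, ch in enumerate(keys)
}

def top_k_keys(fixation: np.ndarray, k: int = 5) -> list[str]:
    """fixation: normalized (x, y) point; returns the k nearest keys."""
    ranked = sorted(KEY_CENTERS, key=lambda ch: np.linalg.norm(KEY_CENTERS[ch] - fixation))
    return ranked[:k]

print(top_k_keys(np.array([0.05, 0.15])))  # keys near the top-left, e.g. 'q', 'w', ...
```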
“It’s very powerful to know where someone is looking,” says Alexandra Papoutsaki, an associate professor of computer science at Pomona College who has studied eye tracking for years and reviewed the GAZEploit research for WIRED.
Papoutsaki says the work stands out as it only relies on the video feed of someone’s Persona, making it a more “realistic” space for an attack to happen when compared to a hacker getting hands-on with someone’s headset and trying to access eye tracking data. “The fact that now someone, just by streaming their Persona, could expose potentially what they’re doing is where the vulnerability becomes a lot more critical,” Papoutsaki says.
While the attack was created in lab settings and hasn’t been used against anyone using Personas in the real world, the researchers say there are ways hackers could have abused the data leakage. They say, theoretically at least, a criminal could share a file with a victim during a Zoom call, resulting in them logging into, say, a Google or Microsoft account. The attacker could then record the Persona while their target logs in and use the attack method to recover their password and access their account.
Quick Fixes
The GAZEploit researchers reported their findings to Apple in April and subsequently sent the company their proof-of-concept code so the attack could be replicated. Apple fixed the flaw in a Vision Pro software update at the end of July, which stops the sharing of a Persona if someone is using the virtual keyboard.
An Apple spokesperson confirmed the company fixed the vulnerability, saying it was addressed in VisionOS 1.3. The company’s software update notes do not mention the fix, but it is detailed in the company's security-specific note. The researchers say Apple assigned CVE-2024-40865 for the vulnerability and recommend people download the latest software updates.
The research highlights how people’s personal data can be inadvertently leaked or exposed. In recent years, police have extracted fingerprints from photographs posted online and identified people by the way they walk in CCTV footage. Law enforcement have also started testing Vision Pros as part of their surveillance efforts.
These privacy and surveillance concerns are likely to become more pressing as wearable technology becomes smaller, cheaper, and able to capture more information about people. “As wearables like glasses, XR, and smartwatches become more integrated into everyday life, users often overlook how much information these devices can collect about their activities and intentions, and the associated privacy risks,” says Cheng Zhang, an assistant professor at Cornell University who also reviewed the Vision Pro research at WIRED’s request. (Zhang’s work has involved creating wearables to help interpret human behaviors.)
“This paper clearly demonstrates one specific risk with gaze typing, but it’s just the tip of the iceberg,” Zhang says. “While these technologies are developed for positive purposes and applications, we also need to be aware of the privacy implications and start taking measures to mitigate potential risks for the future generation of everyday wearables.”
Update 2:30 pm ET, September 12, 2024: Following publication, Apple directed WIRED to a security note where the Vision Pro fix is mentioned. We've updated the story to include this note.
HackerNews comments (excerpt)
Comment thread: https://news.ycombinator.com/item?id=41520516
@eneralizations:
It'd be pretty cyberpunk if the mitigation to this is to have your eyes digitally obscured when typing in sensitive data.