Dalvik Optimization and Verification With dexopt

Author: 孩子气zyj2 | Source: Internet | 2023-10-15 19:54


Reposted from: http://www.netmite.com/android/mydroid/dalvik/docs/dexopt.html

The Dalvik virtual machine was designed specifically for the Android mobile platform. The target systems have little RAM, store data on slow internal flash memory, and generally have the performance characteristics of decade-old desktop systems. They also run Linux, which provides virtual memory, processes and threads, and UID-based security mechanisms.

These features and limitations caused us to focus on certain goals:

Class data, notably bytecode, must be shared between multiple processes to minimize total system memory usage.
The overhead in launching a new app must be minimized to keep the device responsive.
Storing class data in individual files results in a lot of redundancy, especially with respect to strings. To conserve disk space we need to factor this out.
Parsing class data fields adds unnecessary overhead during class loading. Accessing data values (e.g. integers and strings) directly as C types is better.
Bytecode verification is necessary, but slow, so we want to verify as much as possible outside app execution.
Bytecode optimization (quickened instructions, method pruning) is important for speed and battery life.
For security reasons, processes may not edit shared code.

The typical VM implementation uncompresses individual classes from a compressed archive and stores them on the heap. This implies a separate copy of each class in every process, and slows application startup because the code must be uncompressed (or at least read off disk in many small pieces). On the other hand, having the bytecode on the local heap makes it easy to rewrite instructions on first use, facilitating a number of different optimizations.

The goals led us to make some fundamental decisions:

Multiple classes are aggregated into a single "DEX" file.
DEX files are mapped read-only and shared between processes.
Byte ordering and word alignment are adjusted to suit the local system.
Bytecode verification is mandatory for all classes, but we want to "pre-verify" whatever we can.
Optimizations that require rewriting bytecode must be done ahead of time.

The consequences of these decisions are explained in the following sections.

VM Operation

Application code is delivered to the system in a .jar or .apk file. These are really just .zip archives with some meta-data files added. The Dalvik DEX data file is always called classes.dex.

The bytecode cannot be memory-mapped and executed directly from the zip file, because the data is compressed and the start of the file is not guaranteed to be word-aligned. These problems could be addressed by storing classes.dex without compression and padding out the zip file, but that would increase the size of the package sent across the data network.
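The two obstacles named above (compression and alignment) can be checked for any zip entry. The following is a minimal sketch using Python's standard zipfile module; the helper names dex_entry_layout and build_zip, the sample archive contents, and the choice of 4-byte alignment are assumptions made for illustration, not part of Dalvik.

```python
import io
import struct
import zipfile

def dex_entry_layout(zip_bytes, name="classes.dex"):
    """Return (is_stored, data_offset) for one entry of an in-memory zip."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(name)
    # A local file header has 30 fixed bytes, then the file name and the
    # extra field; their lengths sit at byte offsets 26 and 28 of the header.
    header = zip_bytes[info.header_offset:info.header_offset + 30]
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    data_offset = info.header_offset + 30 + name_len + extra_len
    return info.compress_type == zipfile.ZIP_STORED, data_offset

def build_zip(compression):
    """Build a tiny archive with a placeholder classes.dex (illustrative only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression) as zf:
        zf.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        zf.writestr("classes.dex", b"dex\n035\0" + bytes(64))
    return buf.getvalue()

archive = build_zip(zipfile.ZIP_STORED)
stored, offset = dex_entry_layout(archive)
# Mapping in place would need both conditions; 4-byte alignment is assumed here.
can_map_directly = stored and offset % 4 == 0
```

Even an uncompressed entry starts wherever the preceding headers happen to end, which is why padding would be needed in addition to disabling compression.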

We need to extract classes.dex from the zip archive before we can use it. While we have the file available, we might as well perform some of the other actions (realignment, optimization, verification) described earlier. This raises a new question however: who is responsible for doing this, and where do we keep the output?

Preparation

There are at least three different ways to create a "prepared" DEX file, sometimes known as "ODEX" (for Optimized DEX):

The VM does it "just in time". The output goes into a special dalvik-cache directory. This works on the desktop and engineering-only device builds where the permissions on the dalvik-cache directory are not restricted. On production devices, this is not allowed.
The system installer does it when an application is first added. It has the privileges required to write to dalvik-cache.
The build system does it ahead of time. The relevant jar / apk files are present, but the classes.dex is stripped out. The optimized DEX is stored next to the original zip archive, not in dalvik-cache, and is part of the system image.

The dalvik-cache directory is more accurately $ANDROID_DATA/data/dalvik-cache. The files inside it have names derived from the full path of the source DEX. On the device the directory is owned by system / system and has 0771 permissions, and the optimized DEX files stored there are owned by system and the application's group, with 0644 permissions. DRM-locked applications will use 0640 permissions to prevent other user applications from examining them. The bottom line is that you can read your own DEX file and those of most other applications, but you cannot create, modify, or remove them.
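The "bottom line" follows directly from the permission bits. As a rough illustration, the sketch below decodes the three modes mentioned above with Python's stat module; the can_access helper is hypothetical and models only the classic owner/group/other check, ignoring root and any other access controls.

```python
import stat

def can_access(mode, is_owner, in_group):
    """Return (can_read, can_write) under the classic owner/group/other rules."""
    if is_owner:
        read_bit, write_bit = stat.S_IRUSR, stat.S_IWUSR
    elif in_group:
        read_bit, write_bit = stat.S_IRGRP, stat.S_IWGRP
    else:
        read_bit, write_bit = stat.S_IROTH, stat.S_IWOTH
    return bool(mode & read_bit), bool(mode & write_bit)

CACHE_DIR_MODE = 0o771   # dalvik-cache directory
ODEX_MODE = 0o644        # ordinary optimized DEX
ODEX_DRM_MODE = 0o640    # DRM-locked application

# An application is never the owner (system); it is in the group only for
# its own optimized DEX file.
own_file = can_access(ODEX_MODE, is_owner=False, in_group=True)
other_file = can_access(ODEX_MODE, is_owner=False, in_group=False)
other_drm_file = can_access(ODEX_DRM_MODE, is_owner=False, in_group=False)
dir_listing = stat.filemode(stat.S_IFDIR | CACHE_DIR_MODE)
```

For other users the directory grants only the search (execute) bit, so a file can be opened by its known path, but new entries cannot be created or removed.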

Preparation of the DEX file for the "just in time" and "system installer" approaches proceeds in three steps:

First, the dalvik-cache file is created. This must be done in a process with appropriate privileges, so for the "system installer" case this is done within installd, which runs as root.

Second, the classes.dex entry is extracted from the zip archive. A small amount of space is left at the start of the file for the ODEX header.

Third, the file is memory-mapped for easy access and tweaked for use on the current system. This includes byte-swapping and structure realigning, but no meaningful changes to the DEX file. We also do some basic structure checks, such as ensuring that file offsets and data indices fall within valid ranges.

The build system uses a hairy process that involves starting the emulator, forcing just-in-time optimization of all relevant DEX files, and then extracting the results from dalvik-cache. The reasons for doing this, rather than using a tool that runs on the desktop, will become more apparent when the optimizations are explained.

Once the code is byte-swapped and aligned, we're ready to go. We append some pre-computed data, fill in the ODEX header at the start of the file, and start executing. (The header is filled in last, so that we don't try to use a partial file.) If we're interested in verification and optimization, however, we need to insert a step after the initial prep.
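The "header is filled in last" rule is a general pattern for files that must never be consumed half-written. Here is a minimal sketch of it; the 16-byte header layout, the "prep" magic value, and the function names are invented for illustration and do not match the real ODEX format.

```python
import os
import struct
import tempfile

HEADER_FORMAT = "<4sIII"   # magic, DEX offset, DEX length, appended-data length
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)   # 16 bytes in this toy layout
MAGIC = b"prep"            # hypothetical marker, not the real ODEX magic

def write_prepared(path, dex_bytes, extra_bytes):
    with open(path, "wb") as f:
        f.write(bytes(HEADER_SIZE))      # reserve zeroed space for the header
        f.write(dex_bytes)               # the extracted, realigned DEX
        f.write(extra_bytes)             # appended pre-computed data
        f.flush()
        os.fsync(f.fileno())             # body reaches disk before the header
        f.seek(0)
        f.write(struct.pack(HEADER_FORMAT, MAGIC, HEADER_SIZE,
                            len(dex_bytes), len(extra_bytes)))

def read_prepared(path):
    """Return the DEX bytes, or None if the header was never filled in."""
    with open(path, "rb") as f:
        raw = f.read(HEADER_SIZE)
        if len(raw) < HEADER_SIZE:
            return None
        magic, dex_offset, dex_length, _ = struct.unpack(HEADER_FORMAT, raw)
        if magic != MAGIC:
            return None
        f.seek(dex_offset)
        return f.read(dex_length)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.odex")
    write_prepared(path, b"dex-bytes", b"tables")
    round_trip = read_prepared(path)
    with open(path, "r+b") as f:         # simulate a file whose header is blank
        f.write(bytes(HEADER_SIZE))
    partial = read_prepared(path)
```

A reader that sees a zeroed header treats the file as absent, so an interrupted preparation is never mistaken for a finished one.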

dexopt

We want to verify and optimize all of the classes in the DEX file. The easiest and safest way to do this is to load all of the classes into the VM and run through them. Anything that fails to load is simply not verified or optimized. Unfortunately, this can cause allocation of some resources that are difficult to release (e.g. loading of native shared libraries), so we don't want to do it in the same virtual machine that we're running applications in.

The solution is to invoke a program called dexopt, which is really just a back door into the VM. It performs an abbreviated VM initialization, loads zero or more DEX files from the bootstrap class path, and then sets about verifying and optimizing whatever it can from the target DEX. On completion, the process exits, freeing all resources.

It is possible for multiple VMs to want the same DEX file at the same time. File locking is used to ensure that dexopt is only run once.
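One way to get "run only once" behavior is an exclusive advisory lock on the output file itself. The sketch below uses flock via Python's fcntl module (Unix only). Treating an empty file as "not yet prepared" and the name ensure_prepared are assumptions for this example; the text does not say which locking call or readiness test Dalvik uses.

```python
import fcntl
import os
import tempfile

def ensure_prepared(cache_path, prepare):
    """Run prepare(file) at most once per cache file, even across processes.

    Returns True if this call did the work, False if the file was already filled.
    """
    fd = os.open(cache_path, os.O_RDWR | os.O_CREAT, 0o644)
    with os.fdopen(fd, "r+b") as f:
        fcntl.flock(f, fcntl.LOCK_EX)    # blocks while another process prepares
        try:
            if os.fstat(f.fileno()).st_size == 0:
                prepare(f)               # stands in for launching dexopt
                f.flush()
                return True
            return False                 # someone else finished while we waited
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)

with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "example.odex")
    first = ensure_prepared(cache, lambda f: f.write(b"optimized"))
    second = ensure_prepared(cache, lambda f: f.write(b"optimized"))
```

The second caller blocks until the first releases the lock, then finds the work already done and skips it.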

Verification

The bytecode verification process involves scanning through the instructions in every method in every class in a DEX file. The goal is to identify illegal instruction sequences so that we don't have to check for them at run time. Many of the computations involved are also necessary for "exact" garbage collection. See Dalvik Bytecode Verifier Notes for more information.

For performance reasons, the optimizer (described in the next section) assumes that the verifier has run successfully, and makes some potentially unsafe assumptions. By default, Dalvik insists upon verifying all classes, and only optimizes classes that have been verified. If you want to disable the verifier, you can use command-line flags to do so. See also Controlling the Embedded VM for instructions on controlling these features within the Android application framework.

Reporting of verification failures is a tricky issue. For example, calling a package-scope method on a class in a different package is illegal and will be caught by the verifier. We don't necessarily want to report it during verification, however; we actually want to throw an exception when the method call is attempted. Checking the access flags on every method call is expensive, though. The Dalvik Bytecode Verifier Notes document addresses this issue.

Classes that have been verified successfully have a flag set in the ODEX. They will not be re-verified when loaded. The Linux access permissions are expected to prevent tampering; if you can get around those, installing faulty bytecode is far from the easiest line of attack. The ODEX file has a 32-bit checksum, but that's chiefly present as a quick check for corrupted data.
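The difference between a corruption check and tamper protection is easy to see in code. The sketch below uses Adler-32 from Python's zlib module as the 32-bit checksum; the text above does not name the algorithm, so that choice, the 4-byte prefix layout, and the function names are assumptions for illustration.

```python
import zlib

def seal(payload):
    """Prefix the payload with a 32-bit checksum of its contents."""
    return zlib.adler32(payload).to_bytes(4, "little") + payload

def is_intact(blob):
    """Recompute the checksum and compare it with the stored prefix."""
    stored = int.from_bytes(blob[:4], "little")
    return stored == zlib.adler32(blob[4:])

good = seal(b"verified bytecode")
corrupted = good[:-1] + bytes([good[-1] ^ 0x01])    # one flipped bit
tampered = seal(b"malicious bytecode")              # attacker recomputes the sum

accidental_damage_detected = not is_intact(corrupted)
tampering_detected = not is_intact(tampered)
```

Accidental damage is caught, but anyone able to rewrite the file can also rewrite the checksum, which is why the file permissions carry the security burden.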

Optimization

Virtual machine interpreters typically perform certain optimizations the first time a piece of code is used. Constant pool references are replaced with pointers to internal data structures, and operations that always succeed or always work a certain way are replaced with simpler forms. Some of these require information only available at runtime; others can be inferred statically when certain assumptions are made.

The Dalvik optimizer does the following:

For virtual method calls, replace the method index with a vtable index.
For instance field get/put, replace the field index with a byte offset. Also, merge the boolean / byte / char / short variants into a single 32-bit form (less code in the interpreter means more room in the CPU I-cache).
Replace a handful of high-volume calls, like String.length(), with "inline" replacements. This skips the usual method call overhead, directly switching from the interpreter to a native implementation.
Prune empty methods. The simplest example is Object.<init>, which does nothing, but must be called whenever any object is allocated. The instruction is replaced with a new version that acts as a no-op unless a debugger is attached.
Append pre-computed data. For example, the VM wants to have a hash table for lookups on class name. Instead of computing this when the DEX file is loaded, we can compute it now, saving heap space and computation time in every VM where the DEX is loaded.

All of the instruction modifications involve replacing the opcode with one not defined by the Dalvik specification. This allows us to freely mix optimized and unoptimized instructions. The set of optimized instructions, and their exact representation, is tied closely to the VM version.
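The first two rewrites in the list above can be modeled as a pass over an instruction list. This is a toy sketch only: instructions are (opcode, operand) tuples rather than real Dalvik encodings, the lookup tables are supplied by the caller, and the function name quicken is made up for the example.

```python
IGET_VARIANTS = ("iget", "iget-boolean", "iget-byte", "iget-char", "iget-short")

def quicken(code, vtable_index, field_offset):
    """Rewrite resolvable instructions; leave everything else untouched."""
    out = []
    for opcode, operand in code:
        if opcode == "invoke-virtual" and operand in vtable_index:
            # Method reference becomes a vtable slot number.
            out.append(("invoke-virtual-quick", vtable_index[operand]))
        elif opcode in IGET_VARIANTS and operand in field_offset:
            # Field reference becomes a byte offset; the narrow variants
            # collapse into one 32-bit form.
            out.append(("iget-quick", field_offset[operand]))
        else:
            out.append((opcode, operand))
    return out

code = [
    ("invoke-virtual", "Shape.area"),
    ("iget-boolean", "Shape.visible"),
    ("invoke-virtual", "Plugin.run"),     # not resolvable, stays symbolic
]
optimized = quicken(code,
                    vtable_index={"Shape.area": 11},
                    field_offset={"Shape.visible": 8})
```

Because unresolved instructions keep their original opcode, optimized and unoptimized forms coexist in the same method, as the paragraph above describes.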

Most of the optimizations are obvious "wins". The use of raw indices and offsets not only lets us execute more quickly but also lets us skip the initial symbolic resolution. Pre-computation eats up disk space, and so must be done in moderation.

There are a couple of potential sources of trouble with these optimizations. First, vtable indices and byte offsets are subject to change if the VM is updated. Second, if a superclass is in a different DEX, and that other DEX is updated, we need to ensure that our optimized indices and offsets are updated as well. A similar but more subtle problem emerges when user-defined class loaders are employed: the class we actually call may not be the one we expected to call.

These problems are addressed with dependency lists and some limitations on what can be optimized.

Dependencies and Limitations

The optimized DEX file includes a list of dependencies on other DEX files, plus the CRC-32 and modification date from the originating classes.dex zip file entry. The dependency list includes the full path to the dalvik-cache file, and the file's SHA-1 signature. The timestamps of files on the device are unreliable and not used. The dependency area also includes the VM version number.
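A freshness check built from these fields might look like the sketch below. The record types, field names, and the is_up_to_date function are hypothetical; only the set of values compared (VM version, source CRC-32 and modification date, and per-dependency path and SHA-1) comes from the description above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Dependency:
    cache_path: str      # full path of the dependency's dalvik-cache file
    sha1: str            # SHA-1 signature recorded at optimization time

@dataclass(frozen=True)
class OdexDeps:
    vm_version: int
    source_crc32: int    # from the classes.dex zip entry
    source_mod_time: int # from the zip entry, not the file system
    dependencies: tuple

def is_up_to_date(deps, vm_version, source_crc32, source_mod_time, current_sha1):
    """current_sha1 maps cache path -> SHA-1 of the file as it exists now."""
    if deps.vm_version != vm_version:
        return False
    if (deps.source_crc32, deps.source_mod_time) != (source_crc32, source_mod_time):
        return False
    return all(current_sha1.get(d.cache_path) == d.sha1
               for d in deps.dependencies)

recorded = OdexDeps(
    vm_version=3, source_crc32=0x1234ABCD, source_mod_time=1_200_000_000,
    dependencies=(Dependency("/cache/core.dex", "aa11"),
                  Dependency("/cache/framework.dex", "bb22")),
)
now = {"/cache/core.dex": "aa11", "/cache/framework.dex": "bb22"}
fresh = is_up_to_date(recorded, 3, 0x1234ABCD, 1_200_000_000, now)
stale = is_up_to_date(recorded, 3, 0x1234ABCD, 1_200_000_000,
                      {**now, "/cache/core.dex": "cc33"})
```

Any single mismatch invalidates the whole optimized file, which matches the all-or-nothing rejection described in the rest of this section.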

An optimized DEX is dependent upon all of the DEX files in the bootstrap class path. DEX files that are part of the bootstrap class path depend upon the DEX files that appeared earlier. To ensure that nothing outside the dependent DEX files is available, dexopt only loads the bootstrap classes. References to classes in other DEX files fail, which causes class loading and/or verification to fail, and classes with external dependencies are simply not optimized.

This means that splitting code out into many separate DEX files has a disadvantage: virtual method calls and instance field lookups between non-boot DEX files can't be optimized. Because verification is pass/fail with class granularity, no method in a class that has any reliance on classes in external DEX files can be optimized. This may be a bit heavy-handed, but it's the only way to guarantee that nothing breaks when individual pieces are updated.
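The class-granularity rule can be pictured as a simple set test. In this simplified sketch each class is described only by the names it references directly; the function name and the sample class names are invented, and a real VM would also have to account for classes that fail indirectly (for example through a superclass that did not load).

```python
def optimizable_classes(dex_classes, bootstrap):
    """dex_classes maps each class name to the set of class names it references.

    A class qualifies only if every direct reference resolves to the bootstrap
    class path or to the DEX file being processed.
    """
    known = set(dex_classes) | set(bootstrap)
    return {name for name, refs in dex_classes.items() if refs <= known}

bootstrap = {"java.lang.Object", "java.lang.String"}
app_dex = {
    "app.Main": {"java.lang.Object", "app.Helper"},
    "app.Helper": {"java.lang.String"},
    "app.PluginHost": {"java.lang.Object", "plugin.Api"},  # lives in another DEX
}
result = optimizable_classes(app_dex, bootstrap)
```

One external reference is enough to exclude app.PluginHost entirely, including methods that never touch plugin.Api.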

Another negative consequence: any change to a bootstrap DEX will result in rejection of all optimized DEX files. This makes it hard to keep system updates small.

Despite our caution, there is still a possibility that a class in a DEX file loaded by a user-defined class loader could ask for a bootstrap class (say, String) and be given a different class with the same name. If a class in the DEX file being processed has the same name as a class in the bootstrap DEX files, the class will be flagged as ambiguous and references to it will not be resolved during verification / optimization. The class linking code in the VM does additional checks to plug another hole; see the verbose description in the VM sources for details (vm/oo/Class.c).

If one of the dependencies is updated, we need to re-verify and re-optimize the DEX file. If we can do a just-in-time dexopt invocation, this is easy. If we have to rely on the installer daemon, or the DEX was shipped only in ODEX, then the VM has to reject the DEX.

The output of dexopt is byte-swapped and struct-aligned for the host, and contains indices and offsets that are highly VM-specific (both version-wise and platform-wise). For this reason it's tricky to write a version of dexopt that runs on the desktop but generates output suitable for a particular device. The safest way to invoke it is on the target device, or on an emulator for that device.

Generated DEX

Some languages and frameworks rely on the ability to generate bytecode and execute it. The rather heavy dexopt verification and optimization model doesn't work well with that.

We intend to support this in a future release, but the exact method is to be determined. We may allow individual classes to be added or whole DEX files; may allow Java bytecode or Dalvik bytecode in instructions; may perform the usual set of optimizations, or use a separate interpreter that performs on-first-use optimizations directly on the bytecode (which won't be mapped read-only, since it's locally defined).
