DalvikOptimizationandVerificationWithdexopt

作者：孩子气zyj2 | 来源：互联网 | 2023-10-15 19:54

#标签：收藏本文转载自：http:www.netmite.comandroidmydroiddalvikdocsdexopt.htmlTheDalvikvirtualmachine

#标签：收藏

本文转载自：http://www.netmite.com/android/mydroid/dalvik/docs/dexopt.html

The Dalvik virtual machine was designed specifically for the Android mobile platform. The target systems have little RAM, store data on slow internal flash memory, and generally have the performance characteristics of decade-old desktop systems. They also run Linux, which provides virtual memory, processes and threads, and UID-based security mechanisms.

The features and limitations caused us to focus on certain goals:

Class data, notably bytecode, must be shared between multiple processes to minimize total system memory usage.
The overhead in launching a new app must be minimized to keep the device responsive.
Storing class data in individual files results in a lot of redundancy, especially with respect to strings. To conserve disk space we need to factor this out.
Parsing class data fields adds unnecessary overhead during class loading. Accessing data values (e.g. integers and strings) directly as C types is better.
Bytecode verification is necessary, but slow, so we want to verify as much as possible outside app execution.
Bytecode optimization (quickened instructions, method pruning) is important for speed and battery life.
For security reasons, processes may not edit shared code.

The typical VM implementation uncompresses individual classes from a compressed archive and stores them on the heap. This implies a separate copy of each class in every process, and slows application startup because the code must be uncompressed (or at least read off disk in many small pieces). On the other hand, having the bytecode on the local heap makes it easy to rewrite instructions on first use, facilitating a number of different optimizations.

The goals led us to make some fundamental decisions:

Multiple classes are aggregated into a single "DEX" file.
DEX files are mapped read-only and shared between processes.
Byte ordering and word alignment are adjusted to suit the local system.
Bytecode verification is mandatory for all classes, but we want to "pre-verify" whatever we can.
Optimizations that require rewriting bytecode must be done ahead of time.

The consequences of these decisions are explained in the following sections.

VM Operation

Application code is delivered to the system in a .jar or .apk file. These are really just .zip archives with some meta-data files added. The Dalvik DEX data file is always called classes.dex.

The bytecode cannot be memory-mapped and executed directly from the zip file, because the data is compressed and the start of the file is not guaranteed to be word-aligned. These problems could be addressed by storingclasses.dex without compression and padding out the zip file, but that would increase the size of the package sent across the data network.

We need to extract classes.dex from the zip archive before we can use it. While we have the file available, we might as well perform some of the other actions (realignment, optimization, verification) described earlier. This raises a new question however: who is responsible for doing this, and where do we keep the output?

Preparation

There are at least three different ways to create a "prepared" DEX file, sometimes known as "ODEX" (for Optimized DEX):

The VM does it "just in time". The output goes into a special dalvik-cache directory. This works on the desktop and engineering-only device builds where the permissions on the dalvik-cache directory are not restricted. On production devices, this is not allowed.
The system installer does it when an application is first added. It has the privileges required to write to dalvik-cache.
The build system does it ahead of time. The relevant jar / apk files are present, but the classes.dex is stripped out. The optimized DEX is stored next to the original zip archive, not in dalvik-cache, and is part of the system image.

The dalvik-cache directory is more accurately $ANDROID_DATA/data/dalvik-cache. The files inside it have names derived from the full path of the source DEX. On the device the directory is owned by system / system and has 0771 permissions, and the optimized DEX files stored there are owned by system and the application‘s group, with 0644 permissions. DRM-locked applications will use 640 permissions to prevent other user applications from examining them. The bottom line is that you can read your own DEX file and those of most other applications, but you cannot create, modify, or remove them.

Preparation of the DEX file for the "just in time" and "system installer" approaches proceeds in three steps:

First, the dalvik-cache file is created. This must be done in a process with appropriate privileges, so for the "system installer" case this is done within installd, which runs as root.

Second, the classes.dex entry is extracted from the the zip archive. A small amount of space is left at the start of the file for the ODEX header.

Third, the file is memory-mapped for easy access and tweaked for use on the current system. This includes byte-swapping and structure realigning, but no meaningful changes to the DEX file. We also do some basic structure checks, such as ensuring that file offsets and data indices fall within valid ranges.

The build system uses a hairy process that involves starting the emulator, forcing just-in-time optimization of all relevant DEX files, and then extracting the results from dalvik-cache. The reasons for doing this, rather than using a tool that runs on the desktop, will become more apparent when the optimizations are explained.

Once the code is byte-swapped and aligned, we‘re ready to go. We append some pre-computed data, fill in the ODEX header at the start of the file, and start executing. (The header is filled in last, so that we don‘t try to use a partial file.) If we‘re interested in verification and optimization, however, we need to insert a step after the initial prep.

dexopt

We want to verify and optimize all of the classes in the DEX file. The easiest and safest way to do this is to load all of the classes into the VM and run through them. Anything that fails to load is simply not verified or optimized. Unfortunately, this can cause allocation of some resources that are difficult to release (e.g. loading of native shared libraries), so we don‘t want to do it in the same virtual machine that we‘re running applications in.

The solution is to invoke a program called dexopt, which is really just a back door into the VM. It performs an abbreviated VM initialization, loads zero or more DEX files from the bootstrap class path, and then sets about verifying and optimizing whatever it can from the target DEX. On completion, the process exits, freeing all resources.

It is possible for multiple VMs to want the same DEX file at the same time. File locking is used to ensure that dexopt is only run once.

Verification

The bytecode verification process involves scanning through the instructions in every method in every class in a DEX file. The goal is to identify illegal instruction sequences so that we don‘t have to check for them at run time. Many of the computations involved are also necessary for "exact" garbage collection. See Dalvik Bytecode Verifier Notes for more information.

For performance reasons, the optimizer (described in the next section) assumes that the verifier has run successfully, and makes some potentially unsafe assumptions. By default, Dalvik insists upon verifying all classes, and only optimizes classes that have been verified. If you want to disable the verifier, you can use command-line flags to do so. See also Controlling the Embedded VM for instructions on controlling these features within the Android application framework.

Reporting of verification failures is a tricky issue. For example, calling a package-scope method on a class in a different package is illegal and will be caught by the verifier. We don‘t necessarily want to report it during verification though -- we actually want to throw an exception when the method call is attempted. Checking the access flags on every method call is expensive though. The Dalvik Bytecode Verifier Notes document addresses this issue.

Classes that have been verified successfully have a flag set in the ODEX. They will not be re-verified when loaded. The Linux access permissions are expected to prevent tampering; if you can get around those, installing faulty bytecode is far from the easiest line of attack. The ODEX file has a 32-bit checksum, but that‘s chiefly present as a quick check for corrupted data.

Optimization

Virtual machine interpreters typically perform certain optimizations the first time a piece of code is used. Constant pool references are replaced with pointers to internal data structures, operations that always succeed or always work a certain way are replaced with simpler forms. Some of these require information only available at runtime, others can be inferred statically when certain assumptions are made.

The Dalvik optimizer does the following:

For virtual method calls, replace the method index with a vtable index.
For instance field get/put, replace the field index with a byte offset. Also, merge the boolean / byte / char / short variants into a single 32-bit form (less code in the interpreter means more room in the CPU I-cache).
Replace a handful of high-volume calls, like String.length(), with "inline" replacements. This skips the usual method call overhead, directly switching from the interpreter to a native implementation.
Prune empty methods. The simplest example is Object., which does nothing, but must be called whenever any object is allocated. The instruction is replaced with a new version that acts as a no-op unless a debugger is attached.
Append pre-computed data. For example, the VM wants to have a hash table for lookups on class name. Instead of computing this when the DEX file is loaded, we can compute it now, saving heap space and computation time in every VM where the DEX is loaded.

All of the instruction modifications involve replacing the opcode with one not defined by the Dalvik specification. This allows us to freely mix optimized and unoptimized instructions. The set of optimized instructions, and their exact representation, is tied closely to the VM version.

Most of the optimizations are obvious "wins". The use of raw indices and offsets not only allows us to execute more quickly, we can also skip the initial symbolic resolution. Pre-computation eats up disk space, and so must be done in moderation.

There are a couple of potential sources of trouble with these optimizations. First, vtable indices and byte offsets are subject to change if the VM is updated. Second, if a superclass is in a different DEX, and that other DEX is updated, we need to ensure that our optimized indices and offsets are updated as well. A similar but more subtle problem emerges when user-defined class loaders are employed: the class we actually call may not be the one we expected to call.

These problems are addressed with dependency lists and some limitations on what can be optimized.

Dependencies and Limitations

The optimized DEX file includes a list of dependencies on other DEX files, plus the CRC-32 and modification date from the originating classes.dex zip file entry. The dependency list includes the full path to the dalvik-cache file, and the file‘s SHA-1 signature. The timestamps of files on the device are unreliable and not used. The dependency area also includes the VM version number.

An optimized DEX is dependent upon all of the DEX files in the bootstrap class path. DEX files that are part of the bootstrap class path depend upon the DEX files that appeared earlier. To ensure that nothing outside the dependent DEX files is available, dexopt only loads the bootstrap classes. References to classes in other DEX files fail, which causes class loading and/or verification to fail, and classes with external dependencies are simply not optimized.

This means that splitting code out into many separate DEX files has a disadvantage: virtual method calls and instance field lookups between non-boot DEX files can‘t be optimized. Because verification is pass/fail with class granularity, no method in a class that has any reliance on classes in external DEX files can be optimized. This may be a bit heavy-handed, but it‘s the only way to guarantee that nothing breaks when individual pieces are updated.

Another negative consequence: any change to a bootstrap DEX will result in rejection of all optimized DEX files. This makes it hard to keep system updates small.

Despite our caution, there is still a possibility that a class in a DEX file loaded by a user-defined class loader could ask for a bootstrap class (say, String) and be given a different class with the same name. If a class in the DEX file being processed has the same name as a class in the bootstrap DEX files, the class will be flagged as ambiguous and references to it will not be resolved during verification / optimization. The class linking code in the VM does additional checks to plug another hole; see the verbose description in the VM sources for details (vm/oo/Class.c).

If one of the dependencies is updated, we need to re-verify and re-optimize the DEX file. If we can do a just-in-time dexopt invocation, this is easy. If we have to rely on the installer daemon, or the DEX was shipped only in ODEX, then the VM has to reject the DEX.

The output of dexopt is byte-swapped and struct-aligned for the host, and contains indices and offsets that are highly VM-specific (both version-wise and platform-wise). For this reason it‘s tricky to write a version of dexopt that runs on the desktop but generates output suitable for a particular device. The safest way to invoke it is on the target device, or on an emulator for that device.

Generated DEX

Some languages and frameworks rely on the ability to generate bytecode and execute it. The rather heavy dexopt verification and optimization model doesn‘t work well with that.

We intend to support this in a future release, but the exact method is to be determined. We may allow individual classes to be added or whole DEX files; may allow Java bytecode or Dalvik bytecode in instructions; may perform the usual set of optimizations, or use a separate interpreter that performs on-first-use optimizations directly on the bytecode (which won‘t be mapped read-only, since it‘s locally defined).

推荐阅读

less
利用Selenium与ChromeDriver实现豆瓣网页全屏截图

本文介绍了一种使用Selenium和ChromeDriver结合Python代码，轻松实现对豆瓣网站进行完整页面截图的方法。该方法不仅简单易行，而且解决了新版Selenium不再支持PhantomJS的问题。 ... [详细]

蜡笔小新 2024-12-22 15:17:55
less
Appium + Java 自动化测试中处理页面空白区域点击问题

在进行移动应用自动化测试时，有时会遇到某些页面没有返回按钮，只能通过点击空白区域返回的情况。本文将探讨如何在Appium + Java环境中有效解决此类问题，并提供详细的解决方案。 ... [详细]

蜡笔小新 2024-12-22 17:30:25
char
紫荆花之恋：动态树上的小精灵友谊问题

本题来自WC2014，题目编号为BZOJ3435、洛谷P3920和UOJ55。该问题描述了一棵不断生长的带权树及其节点上小精灵之间的友谊关系，要求实时计算每次新增节点后树上所有可能的朋友对数。 ... [详细]

蜡笔小新 2024-12-22 14:36:54
usb
嵌入式开发环境搭建与文件传输指南

本文详细介绍了如何为嵌入式应用开发搭建必要的软硬件环境，并提供了通过串口和网线两种方式将文件传输到开发板的具体步骤。适合Linux开发初学者参考。 ... [详细]

蜡笔小新 2024-12-22 13:38:48
sum
HDU 1536: S-Nim 游戏中的 SG 博弈分析

探讨 HDU 1536 题目，即 S-Nim 游戏的博弈策略。通过 SG 函数分析游戏胜负的关键，并介绍如何编程实现解决方案。 ... [详细]

蜡笔小新 2024-12-21 18:26:33
instance
深入解析动态代理模式：23种设计模式之三

在设计模式中，动态代理模式是应用最为广泛的一种代理模式。它允许我们在运行时动态创建代理对象，并在调用方法时进行增强处理。本文将详细介绍动态代理的实现机制及其应用场景。 ... [详细]

蜡笔小新 2024-12-21 15:46:52
sum
深入理解JavaScript的作用域链与闭包

本文详细探讨了JavaScript中的作用域链和闭包机制，解释了它们的工作原理及其在实际编程中的应用。通过具体的代码示例，帮助读者更好地理解和掌握这些概念。 ... [详细]

蜡笔小新 2024-12-23 01:27:41
cmd
Windows 7 64位系统下Redis的安装与PHP Redis扩展配置

本文详细介绍了在Windows 7 64位操作系统中安装Redis以及配置PHP Redis扩展的方法，包括下载、安装和基本使用步骤。适合对Redis和PHP集成感兴趣的开发人员参考。 ... [详细]

蜡笔小新 2024-12-22 23:56:09
cmd
Python 内存管理机制详解

本文深入探讨了Python的内存管理机制，涵盖了垃圾回收、引用计数和内存池机制。通过具体示例和专业解释，帮助读者理解Python如何高效地管理和释放内存资源。 ... [详细]

蜡笔小新 2024-12-22 19:27:56
settings
如何清除Chrome浏览器地址栏的特定历史记录

在使用Chrome浏览器时，你可能会发现地址栏保存了大量浏览记录。有时你可能希望删除某些特定的历史记录而不影响其他数据。本文将详细介绍如何单独删除地址栏中的特定记录以及批量清除所有历史记录的方法。 ... [详细]

蜡笔小新 2024-12-22 17:14:01
settings
优化DB2数据库性能的关键策略

本文详细介绍了优化DB2数据库性能的多种方法，涵盖统计信息更新、缓冲池调整、日志缓冲区配置、应用程序堆大小设置、排序堆参数调整、代理程序管理、锁机制优化、活动应用程序限制、页清除程序配置、I/O服务器数量设定以及编入组提交数调整等方面。通过这些技术手段，可以显著提升数据库的运行效率和响应速度。 ... [详细]

蜡笔小新 2024-12-22 16:20:33
settings
探索新一代API文档工具，告别Swagger的繁琐

对于后端开发者而言，编写和维护API文档既繁琐又不可或缺。本文将介绍一款全新的API文档工具，帮助团队更高效地协作，简化API文档生成流程。 ... [详细]

蜡笔小新 2024-12-22 11:02:41
settings
优化App数据结构设计

本文探讨了在构建应用程序时，如何对不同类型的数据进行结构化设计。主要分为三类：全局配置、用户个人设置和用户关系链。每种类型的数据都有其独特的用途和应用场景，合理规划这些数据结构有助于提升用户体验和系统的可维护性。 ... [详细]

蜡笔小新 2024-12-22 09:42:30
settings
Linux中的yum安装软件

yum俗称大黄狗作用：解决安装软件包的依赖关系当安装依赖关系的软件包时，会将依赖的软件包一起安装。本地yum：需要yum源，光驱挂载。yum源：（刚开始查看yum源中的内容就是上图 ... [详细]

蜡笔小新 2024-12-22 07:41:00
settings
气象对比分析

本文探讨了不同地区和时间段的天气模式，通过详细的图表和数据分析，揭示了气候变化的趋势及其对环境和社会的影响。 ... [详细]

蜡笔小新 2024-12-21 19:39:55

孩子气zyj2

这个家伙很懒，什么也没留下！

Tags | 热门标签

RankList | 热门文章