热门标签 | HotTags
当前位置:  开发笔记 > 编程语言 > 正文

DalvikOptimizationandVerificationWithdexopt

#标签:收藏本文转载自:http:www.netmite.comandroidmydroiddalvikdocsdexopt.htmlTheDalvikvirtualmachine

#标签: 收藏

本文转载自:http://www.netmite.com/android/mydroid/dalvik/docs/dexopt.html

The Dalvik virtual machine was designed specifically for the Android mobile platform. The target systems have little RAM, store data on slow internal flash memory, and generally have the performance characteristics of decade-old desktop systems. They also run Linux, which provides virtual memory, processes and threads, and UID-based security mechanisms.

The features and limitations caused us to focus on certain goals:

  • Class data, notably bytecode, must be shared between multiple processes to minimize total system memory usage.

  • The overhead in launching a new app must be minimized to keep the device responsive.

  • Storing class data in individual files results in a lot of redundancy, especially with respect to strings. To conserve disk space we need to factor this out.

  • Parsing class data fields adds unnecessary overhead during class loading. Accessing data values (e.g. integers and strings) directly as C types is better.

  • Bytecode verification is necessary, but slow, so we want to verify as much as possible outside app execution.

  • Bytecode optimization (quickened instructions, method pruning) is important for speed and battery life.

  • For security reasons, processes may not edit shared code.

The typical VM implementation uncompresses individual classes from a compressed archive and stores them on the heap. This implies a separate copy of each class in every process, and slows application startup because the code must be uncompressed (or at least read off disk in many small pieces). On the other hand, having the bytecode on the local heap makes it easy to rewrite instructions on first use, facilitating a number of different optimizations.

The goals led us to make some fundamental decisions:

  • Multiple classes are aggregated into a single "DEX" file.

  • DEX files are mapped read-only and shared between processes.

  • Byte ordering and word alignment are adjusted to suit the local system.

  • Bytecode verification is mandatory for all classes, but we want to "pre-verify" whatever we can.

  • Optimizations that require rewriting bytecode must be done ahead of time.

The consequences of these decisions are explained in the following sections.

VM Operation

Application code is delivered to the system in a .jar or .apk file. These are really just .zip archives with some meta-data files added. The Dalvik DEX data file is always called classes.dex.

The bytecode cannot be memory-mapped and executed directly from the zip file, because the data is compressed and the start of the file is not guaranteed to be word-aligned. These problems could be addressed by storingclasses.dex without compression and padding out the zip file, but that would increase the size of the package sent across the data network.

We need to extract classes.dex from the zip archive before we can use it. While we have the file available, we might as well perform some of the other actions (realignment, optimization, verification) described earlier. This raises a new question however: who is responsible for doing this, and where do we keep the output?

Preparation

There are at least three different ways to create a "prepared" DEX file, sometimes known as "ODEX" (for Optimized DEX):

  1. The VM does it "just in time". The output goes into a special dalvik-cache directory. This works on the desktop and engineering-only device builds where the permissions on the dalvik-cache directory are not restricted. On production devices, this is not allowed.

  2. The system installer does it when an application is first added. It has the privileges required to write to dalvik-cache.

  3. The build system does it ahead of time. The relevant jar / apk files are present, but the classes.dex is stripped out. The optimized DEX is stored next to the original zip archive, not in dalvik-cache, and is part of the system image.

The dalvik-cache directory is more accurately $ANDROID_DATA/data/dalvik-cache. The files inside it have names derived from the full path of the source DEX. On the device the directory is owned by system / system and has 0771 permissions, and the optimized DEX files stored there are owned by system and the application‘s group, with 0644 permissions. DRM-locked applications will use 640 permissions to prevent other user applications from examining them. The bottom line is that you can read your own DEX file and those of most other applications, but you cannot create, modify, or remove them.

Preparation of the DEX file for the "just in time" and "system installer" approaches proceeds in three steps:

First, the dalvik-cache file is created. This must be done in a process with appropriate privileges, so for the "system installer" case this is done within installd, which runs as root.

Second, the classes.dex entry is extracted from the the zip archive. A small amount of space is left at the start of the file for the ODEX header.

Third, the file is memory-mapped for easy access and tweaked for use on the current system. This includes byte-swapping and structure realigning, but no meaningful changes to the DEX file. We also do some basic structure checks, such as ensuring that file offsets and data indices fall within valid ranges.

The build system uses a hairy process that involves starting the emulator, forcing just-in-time optimization of all relevant DEX files, and then extracting the results from dalvik-cache. The reasons for doing this, rather than using a tool that runs on the desktop, will become more apparent when the optimizations are explained.

Once the code is byte-swapped and aligned, we‘re ready to go. We append some pre-computed data, fill in the ODEX header at the start of the file, and start executing. (The header is filled in last, so that we don‘t try to use a partial file.) If we‘re interested in verification and optimization, however, we need to insert a step after the initial prep.

dexopt

We want to verify and optimize all of the classes in the DEX file. The easiest and safest way to do this is to load all of the classes into the VM and run through them. Anything that fails to load is simply not verified or optimized. Unfortunately, this can cause allocation of some resources that are difficult to release (e.g. loading of native shared libraries), so we don‘t want to do it in the same virtual machine that we‘re running applications in.

The solution is to invoke a program called dexopt, which is really just a back door into the VM. It performs an abbreviated VM initialization, loads zero or more DEX files from the bootstrap class path, and then sets about verifying and optimizing whatever it can from the target DEX. On completion, the process exits, freeing all resources.

It is possible for multiple VMs to want the same DEX file at the same time. File locking is used to ensure that dexopt is only run once.

Verification

The bytecode verification process involves scanning through the instructions in every method in every class in a DEX file. The goal is to identify illegal instruction sequences so that we don‘t have to check for them at run time. Many of the computations involved are also necessary for "exact" garbage collection. See Dalvik Bytecode Verifier Notes for more information.

For performance reasons, the optimizer (described in the next section) assumes that the verifier has run successfully, and makes some potentially unsafe assumptions. By default, Dalvik insists upon verifying all classes, and only optimizes classes that have been verified. If you want to disable the verifier, you can use command-line flags to do so. See also Controlling the Embedded VM for instructions on controlling these features within the Android application framework.

Reporting of verification failures is a tricky issue. For example, calling a package-scope method on a class in a different package is illegal and will be caught by the verifier. We don‘t necessarily want to report it during verification though -- we actually want to throw an exception when the method call is attempted. Checking the access flags on every method call is expensive though. The Dalvik Bytecode Verifier Notes document addresses this issue.

Classes that have been verified successfully have a flag set in the ODEX. They will not be re-verified when loaded. The Linux access permissions are expected to prevent tampering; if you can get around those, installing faulty bytecode is far from the easiest line of attack. The ODEX file has a 32-bit checksum, but that‘s chiefly present as a quick check for corrupted data.

Optimization

Virtual machine interpreters typically perform certain optimizations the first time a piece of code is used. Constant pool references are replaced with pointers to internal data structures, operations that always succeed or always work a certain way are replaced with simpler forms. Some of these require information only available at runtime, others can be inferred statically when certain assumptions are made.

The Dalvik optimizer does the following:

  • For virtual method calls, replace the method index with a vtable index.

  • For instance field get/put, replace the field index with a byte offset. Also, merge the boolean / byte / char / short variants into a single 32-bit form (less code in the interpreter means more room in the CPU I-cache).

  • Replace a handful of high-volume calls, like String.length(), with "inline" replacements. This skips the usual method call overhead, directly switching from the interpreter to a native implementation.

  • Prune empty methods. The simplest example is Object., which does nothing, but must be called whenever any object is allocated. The instruction is replaced with a new version that acts as a no-op unless a debugger is attached.

  • Append pre-computed data. For example, the VM wants to have a hash table for lookups on class name. Instead of computing this when the DEX file is loaded, we can compute it now, saving heap space and computation time in every VM where the DEX is loaded.

All of the instruction modifications involve replacing the opcode with one not defined by the Dalvik specification. This allows us to freely mix optimized and unoptimized instructions. The set of optimized instructions, and their exact representation, is tied closely to the VM version.

Most of the optimizations are obvious "wins". The use of raw indices and offsets not only allows us to execute more quickly, we can also skip the initial symbolic resolution. Pre-computation eats up disk space, and so must be done in moderation.

There are a couple of potential sources of trouble with these optimizations. First, vtable indices and byte offsets are subject to change if the VM is updated. Second, if a superclass is in a different DEX, and that other DEX is updated, we need to ensure that our optimized indices and offsets are updated as well. A similar but more subtle problem emerges when user-defined class loaders are employed: the class we actually call may not be the one we expected to call.

These problems are addressed with dependency lists and some limitations on what can be optimized.

Dependencies and Limitations

The optimized DEX file includes a list of dependencies on other DEX files, plus the CRC-32 and modification date from the originating classes.dex zip file entry. The dependency list includes the full path to the dalvik-cache file, and the file‘s SHA-1 signature. The timestamps of files on the device are unreliable and not used. The dependency area also includes the VM version number.

An optimized DEX is dependent upon all of the DEX files in the bootstrap class path. DEX files that are part of the bootstrap class path depend upon the DEX files that appeared earlier. To ensure that nothing outside the dependent DEX files is available, dexopt only loads the bootstrap classes. References to classes in other DEX files fail, which causes class loading and/or verification to fail, and classes with external dependencies are simply not optimized.

This means that splitting code out into many separate DEX files has a disadvantage: virtual method calls and instance field lookups between non-boot DEX files can‘t be optimized. Because verification is pass/fail with class granularity, no method in a class that has any reliance on classes in external DEX files can be optimized. This may be a bit heavy-handed, but it‘s the only way to guarantee that nothing breaks when individual pieces are updated.

Another negative consequence: any change to a bootstrap DEX will result in rejection of all optimized DEX files. This makes it hard to keep system updates small.

Despite our caution, there is still a possibility that a class in a DEX file loaded by a user-defined class loader could ask for a bootstrap class (say, String) and be given a different class with the same name. If a class in the DEX file being processed has the same name as a class in the bootstrap DEX files, the class will be flagged as ambiguous and references to it will not be resolved during verification / optimization. The class linking code in the VM does additional checks to plug another hole; see the verbose description in the VM sources for details (vm/oo/Class.c).

If one of the dependencies is updated, we need to re-verify and re-optimize the DEX file. If we can do a just-in-time dexopt invocation, this is easy. If we have to rely on the installer daemon, or the DEX was shipped only in ODEX, then the VM has to reject the DEX.

The output of dexopt is byte-swapped and struct-aligned for the host, and contains indices and offsets that are highly VM-specific (both version-wise and platform-wise). For this reason it‘s tricky to write a version of dexopt that runs on the desktop but generates output suitable for a particular device. The safest way to invoke it is on the target device, or on an emulator for that device.

Generated DEX

Some languages and frameworks rely on the ability to generate bytecode and execute it. The rather heavy dexopt verification and optimization model doesn‘t work well with that.

We intend to support this in a future release, but the exact method is to be determined. We may allow individual classes to be added or whole DEX files; may allow Java bytecode or Dalvik bytecode in instructions; may perform the usual set of optimizations, or use a separate interpreter that performs on-first-use optimizations directly on the bytecode (which won‘t be mapped read-only, since it‘s locally defined).

Copyright ? 2008 The Android Open Source Project


推荐阅读
  • 作为软件工程专业的学生,我深知课堂上教师讲解速度之快,很多时候需要课后自行消化和巩固。因此,撰写这篇Java Web开发入门教程,旨在帮助初学者更好地理解和掌握基础知识。通过详细记录学习过程,希望能为更多像我一样在基础方面还有待提升的学员提供有益的参考。 ... [详细]
  • 单链表的高效遍历及性能优化策略
    本文探讨了单链表的高效遍历方法及其性能优化策略。在单链表的数据结构中,插入操作的时间复杂度为O(n),而遍历操作的时间复杂度为O(n^2)。通过在 `LinkList.h` 和 `main.cpp` 文件中对单链表进行封装,我们实现了创建和销毁功能的优化,提高了单链表的使用效率。此外,文章还介绍了几种常见的优化技术,如缓存节点指针和批量处理,以进一步提升遍历性能。 ... [详细]
  • C# .NET 4.1 版本大型信息化系统集成平台中的主从表事务处理标准示例
    在C# .NET 4.1版本的大型信息化系统集成平台中,本文详细介绍了主从表事务处理的标准示例。通过确保所有操作要么全部成功,要么全部失败,实现主表和关联子表的同步插入。主表插入时会返回当前生成的主键,该主键随后用于子表插入时的关联。以下是一个示例代码片段,展示了如何在一个数据库事务中同时添加角色和相关用户。 ... [详细]
  • DRF框架中Serializer反序列化验证机制详解:深入探讨Validators的应用与优化
    在DRF框架的反序列化验证机制中,除了基本的字段类型和长度校验外,还常常需要进行更为复杂的条件限制校验。通过引入`validators`模块,可以实现自定义校验逻辑,如唯一字段校验等。本文将详细探讨`validators`的使用方法及其优化策略,帮助开发者更好地理解和应用这一重要功能。 ... [详细]
  • 本文深入解析了Java面向对象编程的核心概念及其应用,重点探讨了面向对象的三大特性:封装、继承和多态。封装确保了数据的安全性和代码的可维护性;继承支持代码的重用和扩展;多态则增强了程序的灵活性和可扩展性。通过具体示例,文章详细阐述了这些特性在实际开发中的应用和优势。 ... [详细]
  • 在探讨Hibernate框架的高级特性时,缓存机制和懒加载策略是提升数据操作效率的关键要素。缓存策略能够显著减少数据库访问次数,从而提高应用性能,特别是在处理频繁访问的数据时。Hibernate提供了多层次的缓存支持,包括一级缓存和二级缓存,以满足不同场景下的需求。懒加载策略则通过按需加载关联对象,进一步优化了资源利用和响应时间。本文将深入分析这些机制的实现原理及其最佳实践。 ... [详细]
  • 初探性能优化:入门指南与实践技巧
    在编程领域,常有“尚未精通编码便急于优化”的声音。为了从性能优化的角度提升代码质量,本文将带领读者初步探索性能优化的基本概念与实践技巧。即使程序看似运行良好,数据处理效率仍有待提高,通过系统学习性能优化,能够帮助开发者编写更加高效、稳定的代码。文章不仅介绍了性能优化的基础知识,还提供了实用的调优方法和工具,帮助读者在实际项目中应用这些技术。 ... [详细]
  • ButterKnife 是一款用于 Android 开发的注解库,主要用于简化视图和事件绑定。本文详细介绍了 ButterKnife 的基础用法,包括如何通过注解实现字段和方法的绑定,以及在实际项目中的应用示例。此外,文章还提到了截至 2016 年 4 月 29 日,ButterKnife 的最新版本为 8.0.1,为开发者提供了最新的功能和性能优化。 ... [详细]
  • 本文详细介绍了使用 Python 进行 MySQL 和 Redis 数据库操作的实战技巧。首先,针对 MySQL 数据库,通过 `pymysql` 模块展示了如何连接和操作数据库,包括建立连接、执行查询和更新等常见操作。接着,文章深入探讨了 Redis 的基本命令和高级功能,如键值存储、列表操作和事务处理。此外,还提供了多个实际案例,帮助读者更好地理解和应用这些技术。 ... [详细]
  • 在众多市场调研公司中,如何选择一家值得信赖的合作伙伴至关重要。基于我在市场调查行业近二十年的经验,我将推荐几家国内知名的市场调研机构,供您参考:1. 开元研究——专注于零售报刊发行研究、媒体广告价值评估及网络营销分析等领域,以其专业性和准确性赢得了广泛认可。 ... [详细]
  • POJ3669题目解析:基于广度优先搜索的详细解答
    POJ3669(http://poj.org/problem?id=3669)是一道典型的广度优先搜索(BFS)问题。由于陨石的降落具有时间属性,导致地图状态会随时间动态变化。因此,可以利用结构体来记录每个陨石的降落时间和位置,从而有效地进行状态更新和路径搜索。 ... [详细]
  • iOS 设备唯一标识获取的高效解决方案与实践
    在iOS 7中,苹果公司再次禁止了对MAC地址的访问,使得开发者无法直接获取设备的物理地址。为了在开发过程中实现设备的唯一标识,苹果推荐使用Keychain服务来存储和管理唯一的标识符。此外,还可以结合其他技术手段,如UUID和广告标识符(IDFA),以确保设备的唯一性和安全性。这些方法不仅能够满足应用的需求,还能保护用户的隐私。 ... [详细]
  • 本文探讨了如何有效地构建和优化微信公众平台账号,涵盖了用户信息管理、内容创作与发布、互动策略及数据分析等方面。通过合理设置用户信息字段,如用户名、昵称、密码、真实姓名和性别等,确保账号的安全性和用户体验。同时,文章还介绍了如何利用微信公众平台的各项功能,提升用户参与度和品牌影响力。 ... [详细]
  • 如何高效地安装并配置 PostgreSQL 数据库系统?本文将详细介绍从下载到安装、配置环境变量、初始化数据库、以及优化性能的全过程,帮助读者快速掌握 PostgreSQL 的核心操作与最佳实践。文章还涵盖了常见问题的解决方案,确保用户在部署过程中能够顺利解决遇到的各种挑战。 ... [详细]
  • 深入解析InnoDB中的多版本并发控制机制
    多版本并发控制(MVCC)是InnoDB存储引擎中的一项关键技术,通过维护数据在不同时间点的多个版本,确保了事务的隔离性和一致性。每个读取操作都能获得一个与事务启动时一致的数据视图,从而提高了并发性能并减少了锁竞争。此外,MVCC还支持多种隔离级别,如可重复读和读已提交,进一步增强了系统的灵活性和可靠性。 ... [详细]
author-avatar
孩子气zyj2
这个家伙很懒,什么也没留下!
PHP1.CN | 中国最专业的PHP中文社区 | DevBox开发工具箱 | json解析格式化 |PHP资讯 | PHP教程 | 数据库技术 | 服务器技术 | 前端开发技术 | PHP框架 | 开发工具 | 在线工具
Copyright © 1998 - 2020 PHP1.CN. All Rights Reserved | 京公网安备 11010802041100号 | 京ICP备19059560号-4 | PHP1.CN 第一PHP社区 版权所有