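This article is part of Java ASM Series 3: Tree API.

1. How to determine whether a variable is redundant

If you write the following code in IntelliJ IDEA, it reports that the local variables str2 and str3 are redundant: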

public class HelloWorld {
    public void test() {
        String str1 = "Hello ASM";
        Object obj1 = new Object();
        // Local variable "str2" is redundant
        String str2 = str1;
        Object obj2 = new Object();
        // Local variable "str3" is redundant
        String str3 = str2;
        Object obj3 = new Object();
        int length = str3.length();
        System.out.println(length);
    }
}

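1.1. Overall approach

By combining the Analyzer and SimpleVerifier classes, we can see how the Frame changes at each instruction. Each row shows the local variables and the operand stack before the instruction executes: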

test:()V
000:                         ldc "Hello ASM"    {HelloWorld, ., ., ., ., ., ., .} | {}
001:                                astore_1    {HelloWorld, ., ., ., ., ., ., .} | {String}
002:                              new Object    {HelloWorld, String, ., ., ., ., ., .} | {}
003:                                     dup    {HelloWorld, String, ., ., ., ., ., .} | {Object}
004:             invokespecial Object.<init>    {HelloWorld, String, ., ., ., ., ., .} | {Object, Object}
005:                                astore_2    {HelloWorld, String, ., ., ., ., ., .} | {Object}
006:                                 aload_1    {HelloWorld, String, Object, ., ., ., ., .} | {}
007:                                astore_3    {HelloWorld, String, Object, ., ., ., ., .} | {String}
008:                              new Object    {HelloWorld, String, Object, String, ., ., ., .} | {}
009:                                     dup    {HelloWorld, String, Object, String, ., ., ., .} | {Object}
010:             invokespecial Object.<init>    {HelloWorld, String, Object, String, ., ., ., .} | {Object, Object}
011:                                astore 4    {HelloWorld, String, Object, String, ., ., ., .} | {Object}
012:                                 aload_3    {HelloWorld, String, Object, String, Object, ., ., .} | {}
013:                                astore 5    {HelloWorld, String, Object, String, Object, ., ., .} | {String}
014:                              new Object    {HelloWorld, String, Object, String, Object, String, ., .} | {}
015:                                     dup    {HelloWorld, String, Object, String, Object, String, ., .} | {Object}
016:             invokespecial Object.<init>    {HelloWorld, String, Object, String, Object, String, ., .} | {Object, Object}
017:                                astore 6    {HelloWorld, String, Object, String, Object, String, ., .} | {Object}
018:                                 aload 5    {HelloWorld, String, Object, String, Object, String, Object, .} | {}
019:             invokevirtual String.length    {HelloWorld, String, Object, String, Object, String, Object, .} | {String}
020:                                istore 7    {HelloWorld, String, Object, String, Object, String, Object, .} | {I}
021:                    getstatic System.out    {HelloWorld, String, Object, String, Object, String, Object, I} | {}
022:                                 iload 7    {HelloWorld, String, Object, String, Object, String, Object, I} | {PrintStream}
023:       invokevirtual PrintStream.println    {HelloWorld, String, Object, String, Object, String, Object, I} | {PrintStream, I}
024:                                  return    {HelloWorld, String, Object, String, Object, String, Object, I} | {}
================================================================

Our overall approach is as follows:

  • Each Frame consists of two parts: the local variables and the operand stack.
  • The "variables" defined in the program are stored in the local variables.
  • Ideally, one "variable" corresponds to exactly one slot in the local variables. If a "variable" corresponds to two or more slots, we consider it redundant.

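For one specific frame, the implementation idea is:

  • Compare local[0] with local[1]; if they are the same, local[1] is a redundant variable.
  • Compare local[0] with local[2]; if they are the same, local[2] is a redundant variable.
  • ...
  • Compare local[0] with local[n]; if they are the same, local[n] is a redundant variable.
  • Compare local[1] with local[2]; if they are the same, local[2] is a redundant variable.
  • Compare local[1] with local[3]; if they are the same, local[3] is a redundant variable.
  • ...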

One caveat: if a slot in the local variables holds the "uninitialized value" (BasicValue.UNINITIALIZED_VALUE), we skip it.

Specifically, the "uninitialized value" (BasicValue.UNINITIALIZED_VALUE) appears in two situations:

  • First, on method entry, some slots in the local variables hold the "uninitialized value" (BasicValue.UNINITIALIZED_VALUE).
  • Second, a long or double value occupies two slots, and the second slot holds the "uninitialized value" (BasicValue.UNINITIALIZED_VALUE).
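
The pairwise comparison and the skip rule described above can be sketched without ASM. This is only an illustration under stated assumptions: the local variables of one frame are modeled as a plain Object[], the UNINITIALIZED constant is a hypothetical stand-in for BasicValue.UNINITIALIZED_VALUE, and the class and method names are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;

public class RedundantSlotSketch {
    // Hypothetical stand-in for BasicValue.UNINITIALIZED_VALUE.
    public static final Object UNINITIALIZED = new Object();

    // Returns the slot indexes whose value is the very same object (==)
    // as the value held by an earlier slot. Uninitialized slots are skipped.
    public static List<Integer> findRedundantSlots(Object[] locals) {
        List<Integer> redundant = new ArrayList<>();
        for (int i = 0; i < locals.length; i++) {
            if (locals[i] == UNINITIALIZED) {
                continue;
            }
            for (int j = i + 1; j < locals.length; j++) {
                if (locals[j] == UNINITIALIZED) {
                    continue;
                }
                if (locals[i] == locals[j] && !redundant.contains(j)) {
                    redundant.add(j);
                }
            }
        }
        return redundant;
    }

    public static void main(String[] args) {
        Object self = new Object();
        Object str = new Object();
        Object obj1 = new Object();
        Object obj2 = new Object();
        // Models the locals shown at instruction 014 of the dump:
        // {HelloWorld, String, Object, String, Object, String, ., .}
        Object[] locals = {self, str, obj1, str, obj2, str, UNINITIALIZED, UNINITIALIZED};
        System.out.println(findRedundantSlots(locals)); // prints [3, 5]
    }
}
```

Slots 3 and 5 are reported because they hold the same object as slot 1, while the two trailing uninitialized slots are never compared with each other.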

1.2. Why choose SimpleVerifier

In ASM, Interpreter is an abstract class. The subclasses it ships with are BasicInterpreter, BasicVerifier, SimpleVerifier, and SourceInterpreter. Which one should we choose?

┌───┬───────────────────┬─────────────┬───────┐
│ 0 │    Interpreter    │    Value    │ Range │
├───┼───────────────────┼─────────────┼───────┤
│ 1 │ BasicInterpreter  │ BasicValue  │   7   │
├───┼───────────────────┼─────────────┼───────┤
│ 2 │   BasicVerifier   │ BasicValue  │   7   │
├───┼───────────────────┼─────────────┼───────┤
│ 3 │  SimpleVerifier   │ BasicValue  │   N   │
├───┼───────────────────┼─────────────┼───────┤
│ 4 │ SourceInterpreter │ SourceValue │   N   │
└───┴───────────────────┴─────────────┴───────┘

First, BasicInterpreter and BasicVerifier are ruled out. They model the Frame with only 7 values (the 7 static fields defined in the BasicValue class), so their "expressive power" is weak. If one object is a String and another is an Object, both are represented as BasicValue.REFERENCE_VALUE and cannot be told apart.

Second, SourceInterpreter is ruled out. Its copyOperation method creates a new object (new SourceValue(value.getSize(), insn)), so a copied value can no longer be recognized as the same object.

public class SourceInterpreter extends Interpreter<SourceValue> implements Opcodes {
    @Override
    public SourceValue copyOperation(final AbstractInsnNode insn, final SourceValue value) {
        // A fresh SourceValue is created for every copy, so object identity is lost.
        return new SourceValue(value.getSize(), insn);
    }
}

Why focus on the copyOperation method? Because copyOperation is the method that handles the load and store instructions.

public abstract class Interpreter<V extends Value> {
    /**
     * Interprets a bytecode instruction that moves a value on the stack or to or from local variables.
     * This method is called for the following opcodes:
     *
     * ILOAD, LLOAD, FLOAD, DLOAD, ALOAD,
     * ISTORE, LSTORE, FSTORE, DSTORE, ASTORE,
     * DUP, DUP_X1, DUP_X2, DUP2, DUP2_X1, DUP2_X2, SWAP
     *
     */
    public abstract V copyOperation(AbstractInsnNode insn, V value) throws AnalyzerException;
}

Finally, SimpleVerifier is a suitable choice. On the one hand, it can distinguish different types (class) and different object instances. On the other hand, its copyOperation method preserves identity: the value passed in is the value returned. More precisely, SimpleVerifier inherits copyOperation from its parent class BasicVerifier.

public class BasicVerifier extends BasicInterpreter {
    @Override
    public BasicValue copyOperation(final AbstractInsnNode insn, final BasicValue value)
            throws AnalyzerException {
        //...
        return value;
    }
}
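
To see why the return value of copyOperation matters for an identity comparison, here is a small model that does not use ASM. It assumes a hypothetical Value class and represents the copy behaviour as a UnaryOperator; it is a sketch of the two styles discussed above, not the library's actual code.

```java
import java.util.function.UnaryOperator;

public class CopySemanticsSketch {
    // Hypothetical value type: each instance stands for one abstract value
    // tracked by an interpreter.
    public static final class Value {
        public final String type;

        public Value(String type) {
            this.type = type;
        }
    }

    // Simulates "aload 0; astore 1" with the given copy behaviour and reports
    // whether both slots end up holding the very same Value object.
    public static boolean sameAfterCopy(UnaryOperator<Value> copyOperation) {
        Value[] locals = new Value[2];
        locals[0] = new Value("String");
        Value onStack = copyOperation.apply(locals[0]); // aload: local -> stack
        locals[1] = copyOperation.apply(onStack);       // astore: stack -> local
        return locals[0] == locals[1];
    }

    public static void main(String[] args) {
        // BasicVerifier style: the value is returned unchanged.
        System.out.println(sameAfterCopy(v -> v));                 // prints true
        // SourceInterpreter style: a new object is created on every copy.
        System.out.println(sameAfterCopy(v -> new Value(v.type))); // prints false
    }
}
```

With the first behaviour the `==` check used by the analysis can link the two slots; with the second it cannot.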

2. Example: redundant variable analysis

2.1. Expected goal

In the code below, the local variables str2 and str3 are reported as redundant:

public class HelloWorld {
    public void test() {
        String str1 = "Hello ASM";
        Object obj1 = new Object();
        // Local variable "str2" is redundant
        String str2 = str1;
        Object obj2 = new Object();
        // Local variable "str3" is redundant
        String str3 = str2;
        Object obj3 = new Object();
        int length = str3.length();
        System.out.println(length);
    }
}

Our goal: identify str2 and str3 as redundant variables.

2.2. Implementation

import org.objectweb.asm.Opcodes;
import org.objectweb.asm.tree.AbstractInsnNode;
import org.objectweb.asm.tree.InsnList;
import org.objectweb.asm.tree.MethodNode;
import org.objectweb.asm.tree.VarInsnNode;
import org.objectweb.asm.tree.analysis.*;

import java.util.Arrays;

// Note: TIntArrayList is a primitive int list offering contains/add/toNativeArray
// (GNU Trove 2.x has a class of this shape). Its import is not shown in the original.
public class RedundantVariableDiagnosis {
    public static int[] diagnose(String className, MethodNode mn) throws AnalyzerException {
        // Step 1: preparation. Analyze with SimpleVerifier to obtain the frames.
        Analyzer<BasicValue> analyzer = new Analyzer<>(new SimpleVerifier());
        Frame<BasicValue>[] frames = analyzer.analyze(className, mn);

        // Step 2: use the frames to find which local variable slots hold duplicated values.
        TIntArrayList localIndexList = new TIntArrayList();
        for (Frame<BasicValue> f : frames) {
            if (f == null) {
                // The frame is null for instructions that cannot be reached.
                continue;
            }
            int locals = f.getLocals();
            for (int i = 0; i < locals; i++) {
                BasicValue val1 = f.getLocal(i);
                if (val1 == BasicValue.UNINITIALIZED_VALUE) {
                    continue;
                }
                for (int j = i + 1; j < locals; j++) {
                    BasicValue val2 = f.getLocal(j);
                    if (val2 == BasicValue.UNINITIALIZED_VALUE) {
                        continue;
                    }
                    // Identity comparison: both slots hold the same BasicValue object.
                    if (val1 == val2) {
                        if (!localIndexList.contains(j)) {
                            localIndexList.add(j);
                        }
                    }
                }
            }
        }

        // Step 3: convert slot indexes (local index) into instruction indexes (insn index).
        TIntArrayList insnIndexList = new TIntArrayList();
        InsnList instructions = mn.instructions;
        int size = instructions.size();
        for (int i = 0; i < size; i++) {
            AbstractInsnNode node = instructions.get(i);
            int opcode = node.getOpcode();
            // ISTORE, LSTORE, FSTORE, DSTORE and ASTORE are consecutive opcodes.
            if (opcode >= Opcodes.ISTORE && opcode <= Opcodes.ASTORE) {
                VarInsnNode varInsnNode = (VarInsnNode) node;
                if (localIndexList.contains(varInsnNode.var)) {
                    if (!insnIndexList.contains(i)) {
                        insnIndexList.add(i);
                    }
                }
            }
        }

        // Step 4: convert insnIndexList into an int[].
        int[] array = insnIndexList.toNativeArray();
        Arrays.sort(array);
        return array;
    }
}

2.3. Running the analysis

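import org.objectweb.asm.ClassReader;
import org.objectweb.asm.Opcodes;
import org.objectweb.asm.tree.ClassNode;
import org.objectweb.asm.tree.MethodNode;

import java.util.Arrays;
import java.util.List;

// FileUtils and BoxDrawingUtils are helper classes that are not shown in this article.
public class HelloWorldAnalysisTree {
    public static void main(String[] args) throws Exception {
        String relative_path = "sample/HelloWorld.class";
        String filepath = FileUtils.getFilePath(relative_path);
        byte[] bytes = FileUtils.readBytes(filepath);

        // (1) Build the ClassReader
        ClassReader cr = new ClassReader(bytes);

        // (2) Generate the ClassNode
        int api = Opcodes.ASM9;
        ClassNode cn = new ClassNode(api);

        int parsingOptions = ClassReader.SKIP_DEBUG | ClassReader.SKIP_FRAMES;
        cr.accept(cn, parsingOptions);

        // (3) Run the analysis
        List<MethodNode> methods = cn.methods;
        // For this HelloWorld class, index 0 should be the constructor and index 1 the test method.
        MethodNode mn = methods.get(1);
        int[] array = RedundantVariableDiagnosis.diagnose(cn.name, mn);
        System.out.println(Arrays.toString(array));
        BoxDrawingUtils.printInstructionLinks(mn.instructions, array);
    }
}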

Output:

[7, 13]
      000: ldc "Hello ASM"
      001: astore_1
      002: new Object
      003: dup
      004: invokespecial Object.<init>
      005: astore_2
      006: aload_1
┌──── 007: astore_3
│     008: new Object
│     009: dup
│     010: invokespecial Object.<init>
│     011: astore 4
│     012: aload_3
└──── 013: astore 5
      014: new Object
      015: dup
      016: invokespecial Object.<init>
      017: astore 6
      018: aload 5
      019: invokevirtual String.length
      020: istore 7
      021: getstatic System.out
      022: iload 7
      023: invokevirtual PrintStream.println
      024: return

3. Test cases

3.1. primitive type - no

The method described in this article is not suitable for analyzing primitive types:

  • Every int value is represented by BasicValue.INT_VALUE, so two different values cannot be distinguished.
  • Every float value is represented by BasicValue.FLOAT_VALUE, so two different values cannot be distinguished.
  • Every long value is represented by BasicValue.LONG_VALUE, so two different values cannot be distinguished.
  • Every double value is represented by BasicValue.DOUBLE_VALUE, so two different values cannot be distinguished.

public class HelloWorld {
    public void test() {
        int a = 1;
        int b = 2;
        int c = a + b;
        int d = a - b;
        int e = c * d;
        System.out.println(e);
    }
}

Output (incorrect):

[3, 7, 11, 15]
      000: iconst_1
      001: istore_1
      002: iconst_2
┌──── 003: istore_2
│     004: iload_1
│     005: iload_2
│     006: iadd
├──── 007: istore_3
│     008: iload_1
│     009: iload_2
│     010: isub
├──── 011: istore 4
│     012: iload_3
│     013: iload 4
│     014: imul
└──── 015: istore 5
      016: getstatic System.out
      017: iload 5
      018: invokevirtual PrintStream.println
      019: return
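
The false positives for primitive types follow directly from the shared constants. The sketch below models this without ASM, assuming a hypothetical INT_VALUE object that plays the role of BasicValue.INT_VALUE and a made-up newIntValue method that stands for any int-producing instruction.

```java
public class PrimitiveFalsePositiveSketch {
    // Hypothetical stand-in for BasicValue.INT_VALUE: one shared object for every int.
    public static final Object INT_VALUE = new Object();

    // Models an int-producing instruction (iconst, iadd, ...): the concrete
    // number is ignored and the shared constant is returned.
    public static Object newIntValue(int concreteValue) {
        return INT_VALUE;
    }

    public static void main(String[] args) {
        Object a = newIntValue(1); // int a = 1;
        Object b = newIntValue(2); // int b = 2;
        // The identity check cannot tell a and b apart, so slot b looks "redundant".
        System.out.println(a == b); // prints true
    }
}
```

Because every int slot holds the same object, each int variable after the first one is flagged, which matches the stores at 3, 7, 11 and 15 in the output.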

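3.2. return - no

The method described in this article also does not apply to return statements. In the code below, the local variable result is reported as redundant, but its value only ever occupies a single slot, so comparing slots within a Frame finds nothing: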

public class HelloWorld {
    public Object test() {
        // Local variable "result" is redundant
        Object result = new Object();
        return result;
    }
}

In my opinion, this case can be recognized from the instruction sequence astore, aload, areturn, and does not necessarily require Frame analysis.
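
A minimal sketch of that idea, written without ASM so it stays self-contained: instructions are modeled by a hypothetical Insn class holding an opcode and a slot index, and the opcode numbers are the ones defined by the JVM specification. With the Tree API, the same loop would presumably run over mn.instructions and VarInsnNode instead; that mapping is an assumption, not something shown in this article.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class StoreLoadReturnSketch {
    // Opcode values as defined by the JVM specification.
    public static final int ALOAD = 25;
    public static final int ASTORE = 58;
    public static final int ARETURN = 176;

    // Hypothetical minimal instruction model: opcode plus slot index (-1 if unused).
    public static final class Insn {
        public final int opcode;
        public final int var;

        public Insn(int opcode, int var) {
            this.opcode = opcode;
            this.var = var;
        }
    }

    // Returns the index of every astore that is immediately followed by
    // an aload of the same slot and then an areturn.
    public static List<Integer> findStoreLoadReturn(List<Insn> insns) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i + 2 < insns.size(); i++) {
            Insn store = insns.get(i);
            Insn load = insns.get(i + 1);
            Insn ret = insns.get(i + 2);
            if (store.opcode == ASTORE && load.opcode == ALOAD
                    && store.var == load.var && ret.opcode == ARETURN) {
                hits.add(i);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // new Object; dup; invokespecial Object.<init>; astore_1; aload_1; areturn
        List<Insn> insns = Arrays.asList(
                new Insn(187, -1),   // new
                new Insn(89, -1),    // dup
                new Insn(183, -1),   // invokespecial
                new Insn(ASTORE, 1),
                new Insn(ALOAD, 1),
                new Insn(ARETURN, -1));
        System.out.println(findStoreLoadReturn(insns)); // prints [3]
    }
}
```

This sketch only matches three adjacent instructions. It does not check whether the slot is read elsewhere or whether the aload is a jump target, so a real check would need those extra conditions.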

4. Summary

This article covered the following:

  • First, how to tell whether a variable is redundant: check whether the local variables contain two or more identical values.
  • Second, a code example that implements redundant variable analysis.