This article is part of Java ASM Series 3: Tree API.
1. How to tell whether a variable is redundant
If you write the following code in IntelliJ IDEA, it reports that the local variables str2 and str3 are redundant:
public class HelloWorld {
    public void test() {
        String str1 = "Hello ASM";
        Object obj1 = new Object();
        // Local variable "str2" is redundant
        String str2 = str1;
        Object obj2 = new Object();
        // Local variable "str3" is redundant
        String str3 = str2;
        Object obj3 = new Object();
        int length = str3.length();
        System.out.println(length);
    }
}
1.1. Overall approach
Using the Analyzer and SimpleVerifier classes together, we can watch how the Frame changes:
test:()V
000: ldc "Hello ASM"                    {HelloWorld, ., ., ., ., ., ., .} | {}
001: astore_1                           {HelloWorld, ., ., ., ., ., ., .} | {String}
002: new Object                         {HelloWorld, String, ., ., ., ., ., .} | {}
003: dup                                {HelloWorld, String, ., ., ., ., ., .} | {Object}
004: invokespecial Object.<init>        {HelloWorld, String, ., ., ., ., ., .} | {Object, Object}
005: astore_2                           {HelloWorld, String, ., ., ., ., ., .} | {Object}
006: aload_1                            {HelloWorld, String, Object, ., ., ., ., .} | {}
007: astore_3                           {HelloWorld, String, Object, ., ., ., ., .} | {String}
008: new Object                         {HelloWorld, String, Object, String, ., ., ., .} | {}
009: dup                                {HelloWorld, String, Object, String, ., ., ., .} | {Object}
010: invokespecial Object.<init>        {HelloWorld, String, Object, String, ., ., ., .} | {Object, Object}
011: astore 4                           {HelloWorld, String, Object, String, ., ., ., .} | {Object}
012: aload_3                            {HelloWorld, String, Object, String, Object, ., ., .} | {}
013: astore 5                           {HelloWorld, String, Object, String, Object, ., ., .} | {String}
014: new Object                         {HelloWorld, String, Object, String, Object, String, ., .} | {}
015: dup                                {HelloWorld, String, Object, String, Object, String, ., .} | {Object}
016: invokespecial Object.<init>        {HelloWorld, String, Object, String, Object, String, ., .} | {Object, Object}
017: astore 6                           {HelloWorld, String, Object, String, Object, String, ., .} | {Object}
018: aload 5                            {HelloWorld, String, Object, String, Object, String, Object, .} | {}
019: invokevirtual String.length        {HelloWorld, String, Object, String, Object, String, Object, .} | {String}
020: istore 7                           {HelloWorld, String, Object, String, Object, String, Object, .} | {I}
021: getstatic System.out               {HelloWorld, String, Object, String, Object, String, Object, I} | {}
022: iload 7                            {HelloWorld, String, Object, String, Object, String, Object, I} | {PrintStream}
023: invokevirtual PrintStream.println  {HelloWorld, String, Object, String, Object, String, Object, I} | {PrintStream, I}
024: return                             {HelloWorld, String, Object, String, Object, String, Object, I} | {}
================================================================
Our overall approach is as follows:
- Each Frame consists of two parts: the local variable area and the operand stack.
- The "variables" defined in a program are stored in the local variable area.
- Ideally, one "variable" corresponds to one slot in the local variable area; if one "variable" corresponds to two or more slots, we consider that "variable" redundant.
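For one specific frame, the implementation idea is as follows:
- Compare local[0] with local[1]; if they are the same, local[1] is a redundant variable.
- Compare local[0] with local[2]; if they are the same, local[2] is a redundant variable.
- ...
- Compare local[0] with local[n]; if they are the same, local[n] is a redundant variable.
- Compare local[1] with local[2]; if they are the same, local[2] is a redundant variable.
- Compare local[1] with local[3]; if they are the same, local[3] is a redundant variable.
- ...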
One thing to note: if a local variable slot holds an "uninitialized value" (BasicValue.UNINITIALIZED_VALUE), we skip it.
Specifically, the "uninitialized value" (BasicValue.UNINITIALIZED_VALUE) occurs in two situations:
- First, on method entry, some slots of the local variable area hold the "uninitialized value" (BasicValue.UNINITIALIZED_VALUE).
- Second, a long or double value occupies two slots, and the second slot holds the "uninitialized value" (BasicValue.UNINITIALIZED_VALUE).
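The pairwise comparison described above can be sketched without ASM. This is only a toy model under stated assumptions: the local variable area is a plain Object[], and UNINITIALIZED is a hypothetical stand-in for BasicValue.UNINITIALIZED_VALUE. The class and method names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one frame's local variable area; this is not the ASM API.
class RedundantSlotSketch {
    // Hypothetical stand-in for BasicValue.UNINITIALIZED_VALUE.
    static final Object UNINITIALIZED = new Object();

    // Returns every index j for which some earlier slot i holds the very same object.
    static List<Integer> redundantSlots(Object[] locals) {
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < locals.length; i++) {
            if (locals[i] == UNINITIALIZED) {
                continue; // skip uninitialized slots
            }
            for (int j = i + 1; j < locals.length; j++) {
                if (locals[j] == UNINITIALIZED) {
                    continue;
                }
                // Identity comparison (==), not equals(): we ask "same object?"
                if (locals[i] == locals[j] && !result.contains(j)) {
                    result.add(j);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Object self = new Object();
        Object str = new Object(); // the value first stored by astore_1
        Object obj1 = new Object();
        Object obj2 = new Object();
        // Mirrors the frame before instruction 014: slots 1, 3 and 5 hold the same value.
        Object[] locals = {self, str, obj1, str, obj2, str, UNINITIALIZED, UNINITIALIZED};
        System.out.println(redundantSlots(locals)); // prints [3, 5]
    }
}
```

The two trailing UNINITIALIZED slots are the same object, yet they are not reported, which is exactly why the sentinel check comes first.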
1.2. Why SimpleVerifier
In ASM, Interpreter is an abstract class, and the subclasses provided are BasicInterpreter, BasicVerifier, SimpleVerifier and SourceInterpreter. So which one should we choose?
┌───┬───────────────────┬─────────────┬───────┐
│ 0 │ Interpreter       │ Value       │ Range │
├───┼───────────────────┼─────────────┼───────┤
│ 1 │ BasicInterpreter  │ BasicValue  │ 7     │
├───┼───────────────────┼─────────────┼───────┤
│ 2 │ BasicVerifier     │ BasicValue  │ 7     │
├───┼───────────────────┼─────────────┼───────┤
│ 3 │ SimpleVerifier    │ BasicValue  │ N     │
├───┼───────────────────┼─────────────┼───────┤
│ 4 │ SourceInterpreter │ SourceValue │ N     │
└───┴───────────────────┴─────────────┴───────┘
First, BasicInterpreter and BasicVerifier are ruled out. They simulate Frame changes with only 7 values (the 7 static fields defined in the BasicValue class), and those 7 values have very little expressive power. If one object is a String and another is an Object, both are represented as BasicValue.REFERENCE_VALUE, so there is no way to tell them apart.
Second, SourceInterpreter is ruled out. Its copyOperation method creates a new object (new SourceValue(value.getSize(), insn)), so the value before a copy and the value after it cannot be recognized as the same object.
public class SourceInterpreter extends Interpreter<SourceValue> implements Opcodes {
    @Override
    public SourceValue copyOperation(final AbstractInsnNode insn, final SourceValue value) {
        return new SourceValue(value.getSize(), insn);
    }
}
Why focus on the copyOperation method? Because copyOperation is the method that handles the load and store instructions.
public abstract class Interpreter<V extends Value> {
    /**
     * Interprets a bytecode instruction that moves a value on the stack or to or from local variables.
     * This method is called for the following opcodes:
     *
     * ILOAD, LLOAD, FLOAD, DLOAD, ALOAD,
     * ISTORE, LSTORE, FSTORE, DSTORE, ASTORE,
     * DUP, DUP_X1, DUP_X2, DUP2, DUP2_X1, DUP2_X2, SWAP
     */
    public abstract V copyOperation(AbstractInsnNode insn, V value) throws AnalyzerException;
}
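Finally, SimpleVerifier is the suitable choice. On the one hand, it can distinguish different types (class) and different object instances (object instance). On the other hand, its copyOperation method preserves object identity: the value passed in is the very value returned. More precisely, SimpleVerifier inherits the copyOperation method from its parent class BasicVerifier.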
public class BasicVerifier extends BasicInterpreter {
    @Override
    public BasicValue copyOperation(final AbstractInsnNode insn, final BasicValue value) throws AnalyzerException {
        //...
        return value;
    }
}
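To see why the return value of copyOperation matters for an identity check, here is a minimal sketch that simulates "aload_1; astore_3" under the two copy strategies. It is a toy model and does not use ASM: Val and sameAfterLoadStore are hypothetical names, and the two lambdas only imitate the behavior of the two copyOperation implementations shown above.

```java
import java.util.function.UnaryOperator;

// Toy model, not the ASM API: simulates "aload_1; astore_3" with a pluggable copy step.
class CopyIdentitySketch {
    // Hypothetical stand-in for a value tracked by an Interpreter.
    static final class Val {
        final String type;

        Val(String type) {
            this.type = type;
        }
    }

    // Loads slot 1 onto the stack, stores it into slot 3, and applies copy at each move.
    static boolean sameAfterLoadStore(UnaryOperator<Val> copy) {
        Val[] locals = new Val[4];
        locals[1] = new Val("String");
        Val onStack = copy.apply(locals[1]); // aload_1
        locals[3] = copy.apply(onStack);     // astore_3
        return locals[1] == locals[3];       // identity check, as in the analysis
    }

    public static void main(String[] args) {
        // Imitates BasicVerifier.copyOperation: returns the value it was given.
        System.out.println(sameAfterLoadStore(v -> v));               // prints true
        // Imitates SourceInterpreter.copyOperation: allocates a new value on every copy.
        System.out.println(sameAfterLoadStore(v -> new Val(v.type))); // prints false
    }
}
```

With the allocating strategy, slot 1 and slot 3 never compare equal under ==, so a redundancy check based on identity would find nothing.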
2. Example: redundant variable analysis
2.1. Goal
In the code below, the local variables str2 and str3 are reported as redundant:
public class HelloWorld {
    public void test() {
        String str1 = "Hello ASM";
        Object obj1 = new Object();
        // Local variable "str2" is redundant
        String str2 = str1;
        Object obj2 = new Object();
        // Local variable "str3" is redundant
        String str3 = str2;
        Object obj3 = new Object();
        int length = str3.length();
        System.out.println(length);
    }
}
Our goal: identify str2 and str3 as redundant variables.
2.2. Implementation
import org.objectweb.asm.Opcodes;
import org.objectweb.asm.tree.AbstractInsnNode;
import org.objectweb.asm.tree.InsnList;
import org.objectweb.asm.tree.MethodNode;
import org.objectweb.asm.tree.VarInsnNode;
import org.objectweb.asm.tree.analysis.*;

import java.util.Arrays;

// TIntArrayList is a resizable list of primitive ints; its import is not shown here.
public class RedundantVariableDiagnosis {
    public static int[] diagnose(String className, MethodNode mn) throws AnalyzerException {
        // Step 1: preparation. Analyze with SimpleVerifier to obtain the frames.
        Analyzer<BasicValue> analyzer = new Analyzer<>(new SimpleVerifier());
        Frame<BasicValue>[] frames = analyzer.analyze(className, mn);

        // Step 2: use the frames to find which local variable slots hold redundant data.
        TIntArrayList localIndexList = new TIntArrayList();
        for (Frame<BasicValue> f : frames) {
            if (f == null) {
                continue; // the Analyzer leaves null frames for unreachable instructions
            }
            int locals = f.getLocals();
            for (int i = 0; i < locals; i++) {
                BasicValue val1 = f.getLocal(i);
                if (val1 == BasicValue.UNINITIALIZED_VALUE) {
                    continue;
                }
                for (int j = i + 1; j < locals; j++) {
                    BasicValue val2 = f.getLocal(j);
                    if (val2 == BasicValue.UNINITIALIZED_VALUE) {
                        continue;
                    }
                    // Identity comparison: both slots hold the very same value object.
                    if (val1 == val2) {
                        if (!localIndexList.contains(j)) {
                            localIndexList.add(j);
                        }
                    }
                }
            }
        }

        // Step 3: convert the slot indices (local index) into instruction indices (insn index).
        TIntArrayList insnIndexList = new TIntArrayList();
        InsnList instructions = mn.instructions;
        int size = instructions.size();
        for (int i = 0; i < size; i++) {
            AbstractInsnNode node = instructions.get(i);
            int opcode = node.getOpcode();
            // ISTORE, LSTORE, FSTORE, DSTORE and ASTORE have consecutive opcode values.
            if (opcode >= Opcodes.ISTORE && opcode <= Opcodes.ASTORE) {
                VarInsnNode varInsnNode = (VarInsnNode) node;
                if (localIndexList.contains(varInsnNode.var)) {
                    if (!insnIndexList.contains(i)) {
                        insnIndexList.add(i);
                    }
                }
            }
        }

        // Step 4: convert insnIndexList into an int[].
        int[] array = insnIndexList.toNativeArray();
        Arrays.sort(array);
        return array;
    }
}
2.3. Running the analysis
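import org.objectweb.asm.ClassReader;
import org.objectweb.asm.Opcodes;
import org.objectweb.asm.tree.ClassNode;
import org.objectweb.asm.tree.MethodNode;

import java.util.Arrays;
import java.util.List;

// FileUtils and BoxDrawingUtils are helper classes outside ASM and the JDK; their imports are not shown here.
public class HelloWorldAnalysisTree {
    public static void main(String[] args) throws Exception {
        String relative_path = "sample/HelloWorld.class";
        String filepath = FileUtils.getFilePath(relative_path);
        byte[] bytes = FileUtils.readBytes(filepath);

        // (1) Build the ClassReader
        ClassReader cr = new ClassReader(bytes);

        // (2) Generate the ClassNode
        int api = Opcodes.ASM9;
        ClassNode cn = new ClassNode(api);
        int parsingOptions = ClassReader.SKIP_DEBUG | ClassReader.SKIP_FRAMES;
        cr.accept(cn, parsingOptions);

        // (3) Run the analysis
        List<MethodNode> methods = cn.methods;
        // Index 0 is the constructor <init>; index 1 is test().
        MethodNode mn = methods.get(1);
        int[] array = RedundantVariableDiagnosis.diagnose(cn.name, mn);
        System.out.println(Arrays.toString(array));
        BoxDrawingUtils.printInstructionLinks(mn.instructions, array);
    }
}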
Output:
[7, 13]
      000: ldc "Hello ASM"
      001: astore_1
      002: new Object
      003: dup
      004: invokespecial Object.<init>
      005: astore_2
      006: aload_1
┌──── 007: astore_3
│     008: new Object
│     009: dup
│     010: invokespecial Object.<init>
│     011: astore 4
│     012: aload_3
└──── 013: astore 5
      014: new Object
      015: dup
      016: invokespecial Object.<init>
      017: astore 6
      018: aload 5
      019: invokevirtual String.length
      020: istore 7
      021: getstatic System.out
      022: iload 7
      023: invokevirtual PrintStream.println
      024: return
3. Test cases
3.1. primitive type - no
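The method described in this article is not suitable for analyzing primitive types:
- All int values are represented by BasicValue.INT_VALUE, so two different values cannot be told apart.
- All float values are represented by BasicValue.FLOAT_VALUE, so two different values cannot be told apart.
- All long values are represented by BasicValue.LONG_VALUE, so two different values cannot be told apart.
- All double values are represented by BasicValue.DOUBLE_VALUE, so two different values cannot be told apart.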
public class HelloWorld {
    public void test() {
        int a = 1;
        int b = 2;
        int c = a + b;
        int d = a - b;
        int e = c * d;
        System.out.println(e);
    }
}
Output (incorrect):
[3, 7, 11, 15]
      000: iconst_1
      001: istore_1
      002: iconst_2
┌──── 003: istore_2
│     004: iload_1
│     005: iload_2
│     006: iadd
├──── 007: istore_3
│     008: iload_1
│     009: iload_2
│     010: isub
├──── 011: istore 4
│     012: iload_3
│     013: iload 4
│     014: imul
└──── 015: istore 5
      016: getstatic System.out
      017: iload 5
      018: invokevirtual PrintStream.println
      019: return
3.2. return - no
The method described in this article also does not apply to return statements. In the code below, the local variable result is reported as redundant:
public class HelloWorld {
    public Object test() {
        // Local variable "result" is redundant
        Object result = new Object();
        return result;
    }
}
I think this case can be recognized by the instruction sequence astore, aload, areturn; it does not necessarily require Frame analysis.
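As a rough illustration of that idea, the sketch below scans for three consecutive instructions of the form "astore n; aload n; areturn". It is a toy model and an assumption of how such a check might look, not a tested detector: instructions are represented as mnemonic strings plus a local index rather than ASM nodes, the names are hypothetical, and jump targets between the three instructions are ignored.

```java
// Toy model, not the ASM API: each instruction is a mnemonic plus a local index (-1 if unused).
class StoreLoadReturnSketch {
    // Returns the index of the first "astore n; aload n; areturn" triple, or -1 if there is none.
    static int findStoreLoadReturn(String[] ops, int[] vars) {
        for (int i = 0; i + 2 < ops.length; i++) {
            boolean sameSlot = vars[i] == vars[i + 1];
            if (ops[i].equals("astore") && ops[i + 1].equals("aload")
                    && sameSlot && ops[i + 2].equals("areturn")) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Roughly the body of test(): Object result = new Object(); return result;
        String[] ops = {"new", "dup", "invokespecial", "astore", "aload", "areturn"};
        int[] vars = {-1, -1, -1, 1, 1, -1};
        System.out.println(findStoreLoadReturn(ops, vars)); // prints 3
    }
}
```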
4. Summary
To summarize this article:
- First, how do we tell whether a variable is redundant? Check whether the local variable area holds two or more identical values.
- Second, a code example that implements redundant variable analysis.