Loading binary data
To load binary data, use the load_program function, passing it either a filename or a byte string.
In [1]: import amoco
In [2]: p = amoco.system.loader.load_program(u'samples/x86/flow.elf')
In [3]: print(p)
<amoco.system.linux_x86.ELF object at 0x7f834b4187d0>
In [4]: print(p.bin.Ehdr)
ELF header:
[Elf32_Ehdr]
e_ident :ELF; ELFOSABI_SYSV; 1; ELFCLASS32; ELFDATA2LSB; 0; 127 # 16 bytes. The first four bytes are the ELF magic number; the remaining bytes describe how the file contents are to be decoded
e_type :ET_EXEC # object file type; ET_EXEC is an executable file
e_machine :EM_386 # architecture the file targets
e_version :EV_CURRENT
e_entry :0x8048380
e_phoff :52 # file offset of the program header table
e_shoff :4416 # file offset of the section header table
e_flags :0x0 # 4 bytes, processor-specific flags; neither the 32-bit nor the 64-bit Intel architecture defines any, so e_flags is 0
e_ehsize :52 # 2 bytes, size of the ELF header: 52 bytes for 32-bit ELF, 64 bytes for 64-bit ELF
e_phentsize :32 # 2 bytes, size of each entry in the program header table
e_phnum :9 # 2 bytes, number of entries in the program header table; 0 if the file has no program header table
e_shentsize :40 # 2 bytes, size of each entry in the section header table, i.e. how many bytes one section header occupies
e_shnum :30 # 2 bytes, number of headers in the section header table; 0 if the file has no section header table. e_shentsize multiplied by e_shnum gives the size of the whole section header table
e_shstrndx :27 # section header string table index: the index, within the section header table, of the section name string table; SHN_UNDEF if the file has no section name string table
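The header fields above can also be decoded by hand from the first 52 bytes of a 32-bit ELF file. The following is a minimal sketch that uses only Python's struct module, not amoco. The helper name parse_elf32_header is an assumption made for illustration, and the sample bytes are synthetic, packed from the values printed above rather than read from flow.elf.

```python
import struct

# Elf32_Ehdr, little-endian: e_ident[16] followed by the 13 fields shown above.
EHDR_FMT = "<16sHHIIIIIHHHHHH"
EHDR_FIELDS = ("e_ident", "e_type", "e_machine", "e_version", "e_entry",
               "e_phoff", "e_shoff", "e_flags", "e_ehsize", "e_phentsize",
               "e_phnum", "e_shentsize", "e_shnum", "e_shstrndx")

def parse_elf32_header(data):
    """Hypothetical helper: decode an Elf32_Ehdr from the start of `data`."""
    size = struct.calcsize(EHDR_FMT)  # 52 bytes for a 32-bit ELF header
    if len(data) < size or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return dict(zip(EHDR_FIELDS, struct.unpack(EHDR_FMT, data[:size])))

# Synthetic header: magic, ELFCLASS32 (1), ELFDATA2LSB (1), version 1,
# ELFOSABI_SYSV (0), then 8 padding bytes.
e_ident = b"\x7fELF" + bytes([1, 1, 1, 0]) + bytes(8)
sample = struct.pack(EHDR_FMT, e_ident,
                     2, 3, 1,            # ET_EXEC, EM_386, EV_CURRENT
                     0x8048380, 52, 4416, 0,
                     52, 32, 9, 40, 30, 27)

hdr = parse_elf32_header(sample)
# Size of the whole section header table, as described for e_shnum.
sh_table_size = hdr["e_shentsize"] * hdr["e_shnum"]
print(hex(hdr["e_entry"]), sh_table_size)  # prints: 0x8048380 1200
```

With the values from this sample, the section header table occupies 40 * 30 = 1200 bytes starting at file offset 4416.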
Symbolic representations of blocks
A block object provides the instructions located at a given address in memory, and represents the effect of that instruction sequence as a symbolic function:
In [7]: print(b.map.view)
eip <- (eip+-0x10)
eflags:
| cf <- 0x0
| sf <- (((esp+0x4)&0xfffffff0)<0x0)
| tf <- tf
| zf <- (((esp+0x4)&0xfffffff0)==0x0)
| pf <- (0x6996>>(((esp[0:8]+0x4)&0xf0)>>0x4)[0:4])[0:1]
| of <- 0x0
| df <- df
| af <- af
ebp <- 0x0
esp <- (((esp+0x4)&0xfffffff0)-0x24)
esi <- M32(esp)
ecx <- (esp+0x4)
(((esp+0x4)&0xfffffff0)-4) <- eax
(((esp+0x4)&0xfffffff0)-8) <- (((esp+0x4)&0xfffffff0)-0x4)
(((esp+0x4)&0xfffffff0)-12) <- edx
(((esp+0x4)&0xfffffff0)-16) <- 0x8048610
(((esp+0x4)&0xfffffff0)-20) <- 0x80485a0
(((esp+0x4)&0xfffffff0)-24) <- (esp+0x4)
(((esp+0x4)&0xfffffff0)-28) <- M32(esp)
(((esp+0x4)&0xfffffff0)-32) <- 0x80484fd
(((esp+0x4)&0xfffffff0)-36) <- (eip+0x21)
- CF (carry flag): 1 if the arithmetic operation produced a carry or borrow out of the most significant bit; 0 otherwise.
- PF (parity flag): 1 if the low 8 bits of the result contain an even number of 1 bits; 0 if the number is odd.
- AF (auxiliary carry flag): 1 if a carry or borrow occurred between bit 3 and bit 4 (D3→D4); 0 otherwise.
- ZF (zero flag): 1 if the result is 0; 0 otherwise.
- SF (sign flag): 1 if the most significant bit of the result is 1; 0 if it is 0.
- OF (overflow flag): 1 if the operation overflowed; 0 otherwise.
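To make the flag definitions concrete, here is a small sketch that computes the six flags for an unsigned addition on concrete integers. It is plain Python written for these notes, not amoco code; the function name add_flags and its bits parameter are assumptions.

```python
def add_flags(a, b, bits=32):
    """Hypothetical helper: x86-style status flags for a + b on `bits`-wide values."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    full = a + b
    res = full & mask
    return {
        "cf": int(full > mask),                           # carry out of the top bit
        "pf": int(bin(res & 0xFF).count("1") % 2 == 0),   # even number of 1s in low byte
        "af": int(((a ^ b ^ res) & 0x10) != 0),           # carry from bit 3 into bit 4
        "zf": int(res == 0),                              # result is zero
        "sf": int((res & sign) != 0),                     # top bit of the result
        "of": int(((a ^ res) & (b ^ res) & sign) != 0),   # signed overflow
    }

wrap = add_flags(0xFFFFFFFF, 1)      # unsigned wrap-around to 0
signed = add_flags(0x7FFFFFFF, 1)    # largest positive int32 plus one
print(wrap)
print(signed)
```

The first call sets CF and ZF but not OF (an unsigned wrap), while the second sets OF and SF but not CF (a signed overflow), which is the distinction the CF and OF entries describe.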
class system.core.CoreExec(p, cpu=None)
The CoreExec class implements the base class for a memory-mapped binary executable program. It provides the generic instruction and data fetchers, along with the mandatory API used by the amoco.main analysis classes. Most of the amoco.system modules use this base class and redefine initenv(), load_binary() and the helper methods for a dedicated system and architecture (Linux/x86, Win32/x86, etc.).
lsweep:
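Linear sweep based analysis: a fast but somewhat dumb way of disassembling a program. Other strategies usually inherit from this class, which provides the generic methods sequence() and iterblocks() as instruction and basic block iterators.
There are several strategies for building the control flow graph of a program (that is, the CFG of all its functions), but none of them is perfect. Some strategies are implemented in the main module, ranging from the simple linear sweep method main.lsweep to the link-backward method (see main.lbackward), which evaluates the program counter backwards until either a concrete value is obtained or the root node of the current CFG is reached.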
cas/mapper.py
The mapper module essentially implements the mapper class and the associated merge() function, which returns a symbolic representation of the union of two mappers.
class cas.mapper.mapper(instrlist=None, csi=None)
A mapper is a symbolic functional representation of the execution of a set of instructions.
Parameters
- instrlist (list[instruction]) – a list of instructions that are symbolically executed within the mapper.
- csi (Optional[object]) – the optional csi attribute that provides a concrete initial state.
__map
is an ordered list of mappings of expressions associated with a location (register or memory pointer). The order matters only in that it reflects the order of write-to-memory instructions when pointers may alias.
__Mem
is a memory model in which symbolic memory pointers address separate memory zones (as I understand it, a zone is essentially a set of offsets). See the MemoryMap and MemoryZone classes.
conds
is the list of conditions that must be True for the mapper
csi
is the optional interface to a concrete state
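The idea of a mapper as a function from the initial state to the final state can be sketched with a toy model. The class below is a hypothetical stand-in written for these notes: it is not amoco's mapper, it handles only registers and integer immediates, and it builds expressions as plain strings rather than as amoco expression objects.

```python
class ToyMapper:
    """Hypothetical stand-in for a mapper: location -> symbolic expression."""

    def __init__(self):
        self.map = {}  # dicts keep insertion order in Python 3.7+

    def _val(self, operand):
        # Immediates print as hex; a register that was never written
        # evaluates to itself, i.e. to its value in the initial state.
        if isinstance(operand, int):
            return hex(operand)
        return self.map.get(operand, operand)

    def mov(self, dst, src):
        self.map[dst] = self._val(src)

    def add(self, dst, src):
        # The right-hand side is evaluated before dst is overwritten.
        self.map[dst] = "(%s+%s)" % (self._val(dst), self._val(src))

    def view(self):
        return "\n".join("%s <- %s" % item for item in self.map.items())

m = ToyMapper()
m.mov("ecx", "esp")   # ecx <- esp
m.add("ecx", 4)       # ecx <- (esp+0x4)
m.mov("ebp", 0)       # ebp <- 0x0
m.mov("eax", "ecx")   # eax <- (esp+0x4), expressed over the initial esp
print(m.view())
```

Every right-hand side ends up written in terms of the register values before the block ran, which is why the b.map.view output earlier in these notes only mentions the initial esp, eax, edx and so on.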
class system.core.MemoryZone(rel=None)
A MemoryZone contains mo objects at addresses that are integer offsets relative to a symbolic expression. A default zone, whose related address is set to None, holds the values located at concrete addresses in every MemoryMap.
class system.core.MemoryMap(D=None)
Provides a way to represent concrete and abstract symbolic values located in the virtual memory space of a process. A MemoryMap is organised as a collection of MemoryZone objects.
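The zone organisation can be pictured with a toy model in which each zone is keyed by its symbolic base and the default zone is keyed by None. The classes ToyZone and ToyMemoryMap below are hypothetical and greatly simplified; they do not reproduce amoco's MemoryZone or MemoryMap API, and they store one value per offset without any notion of size.

```python
class ToyZone:
    """Hypothetical zone: values stored at integer offsets from one base."""

    def __init__(self, rel=None):
        self.rel = rel      # symbolic base expression, or None for concrete addresses
        self.cells = {}     # offset -> value

class ToyMemoryMap:
    """Hypothetical map: a collection of zones, one per symbolic base."""

    def __init__(self):
        self.zones = {None: ToyZone(None)}  # the default zone always exists

    def write(self, offset, value, rel=None):
        if rel not in self.zones:
            self.zones[rel] = ToyZone(rel)
        self.zones[rel].cells[offset] = value

    def read(self, offset, rel=None):
        zone = self.zones.get(rel)
        if zone is not None and offset in zone.cells:
            return zone.cells[offset]
        # Nothing written there yet: return a symbolic "memory at address" term.
        if rel is None:
            addr = hex(offset)
        elif offset == 0:
            addr = rel
        else:
            addr = "(%s%+d)" % (rel, offset)
        return "M32(%s)" % addr

mem = ToyMemoryMap()
mem.write(-4, "eax", rel="esp")       # symbolic zone based on esp
mem.write(0x8048610, 0x90)            # default zone, concrete address
print(mem.read(-4, rel="esp"), mem.read(0, rel="esp"), mem.read(0x8048610))
```

Reading an address that was never written yields a term such as M32(esp), the same shape as the esi <- M32(esp) line in the block view shown earlier.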
linear()
The linear sweep method (main.lsweep) works basically like objdump. It produces instructions by disassembling bytes one after the other, ignoring the effective control flow. For standard x86/x64 binaries the result is not so bad, because code and data are rarely interlaced, but for many other architectures the result is incorrect. Still, at almost no cost, it provides an over-approximation of the set of all basic blocks for architectures with strict fixed-length instruction alignment.
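The weakness described above shows up as soon as data sits between instructions. The sketch below runs a linear sweep over a toy opcode table; it is not amoco's lsweep. The table TOY_OPCODES and the function linear_sweep are assumptions made for illustration, and the four opcodes only mimic common x86 encodings.

```python
# Toy opcode table: first byte -> (mnemonic, total instruction length in bytes).
TOY_OPCODES = {
    0x90: ("nop", 1),
    0xC3: ("ret", 1),
    0xEB: ("jmp", 2),   # short jump with a one-byte displacement
    0xB8: ("mov", 5),   # move with a four-byte immediate
}

def linear_sweep(code, base=0):
    """Decode bytes one after the other, never following jumps."""
    offset = 0
    while offset < len(code):
        mnemonic, length = TOY_OPCODES.get(code[offset], ("db", 1))
        if offset + length > len(code):
            mnemonic, length = "db", 1   # truncated instruction: emit a raw byte
        yield base + offset, mnemonic, length
        offset += length

# jmp +1 skips a single data byte (0xB8); execution resumes at the four nops.
code = bytes([0xEB, 0x01, 0xB8, 0x90, 0x90, 0x90, 0x90, 0xC3])
listing = list(linear_sweep(code, base=0x1000))
for addr, mnemonic, length in listing:
    print(hex(addr), mnemonic, length)
```

The sweep reports jmp, mov, ret: it decodes the skipped data byte as the start of a five-byte mov and swallows the four nop instructions that would actually execute.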
iterblocks(loc=None)
Iterator over basic blocks. The instruction.type attribute is used to detect the end of a block (type_control_flow). The returned block object is enhanced with platform-specific information (see block.misc).
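Splitting an instruction stream at control-flow instructions can be sketched as a small generator. This is a toy illustration rather than amoco's iterblocks(): instructions are plain (address, mnemonic) tuples, and the CONTROL_FLOW set is an assumed stand-in for the type_control_flow check.

```python
# Assumed stand-in for instruction.type == type_control_flow.
CONTROL_FLOW = {"jmp", "jz", "call", "ret"}

def iter_blocks(instructions):
    """Yield lists of instructions, each ending at a control-flow instruction."""
    block = []
    for addr, mnemonic in instructions:
        block.append((addr, mnemonic))
        if mnemonic in CONTROL_FLOW:
            yield block
            block = []
    if block:
        yield block   # trailing instructions with no terminator

instrs = [(0, "push"), (1, "mov"), (3, "call"),
          (8, "add"), (11, "ret"), (12, "nop")]
blocks = list(iter_blocks(instrs))
for blk in blocks:
    print([mnemonic for _, mnemonic in blk])
```

Each yielded list plays the role of one basic block; a real implementation would also attach the platform-specific details mentioned for block.misc.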
References:
https://www.cnblogs.com/jiqingwu/p/elf_explore_2.html
Official documentation: https://buildmedia.readthedocs.org/media/pdf/amoco/stable/amoco.pdf (a must-read)