- Rhy7hm
  

ELF Relocation + an Algorithm for Finding ROP Gadgets

05 Sep 2018

Reading time ~2 minutes

Not related to ret2dlresolve.

A look at ELF

Reference: 《Linux二进制分析》 (the Chinese edition of Learning Linux Binary Analysis)

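An ELF file can be marked as one of: unknown type (ET_NONE), relocatable file (an object file, ET_REL), executable file (ET_EXEC), shared object file (ET_DYN), or core file (ET_CORE). A shared object file is a dynamically linkable object file, a special kind of relocatable object file.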
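The file type lives in the e_type field of the ELF header, right after the 16-byte e_ident array. The following is a minimal sketch of reading it; the function name elf_type and the label strings are made up for illustration.

```python
import struct

# e_type values from the ELF specification; the label text is illustrative.
ET_NAMES = {
    0: "ET_NONE (unknown)",
    1: "ET_REL (relocatable)",
    2: "ET_EXEC (executable)",
    3: "ET_DYN (shared object)",
    4: "ET_CORE (core)",
}

def elf_type(header: bytes) -> str:
    """Return a label for e_type given at least the first 18 bytes of a file."""
    if len(header) < 18 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type is a 16-bit field at offset 16 in both ELF32 and ELF64
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    return ET_NAMES.get(e_type, f"other (0x{e_type:x})")
```

A possible use, assuming a Linux system where /bin/ls exists: elf_type(open("/bin/ls", "rb").read(18)). On many modern distributions a position-independent executable reports ET_DYN, the same type as a shared object.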

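The strip command shrinks a program by removing parts of the file that are not needed to run it: the symbol table, string table, line-number information, debug sections, comment sections, relocation information and so on (the man page referenced below also lists a typchk section, which comes from the XCOFF format rather than ELF). A stripped executable cannot be restored.

References: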

https://blog.csdn.net/stpeace/article/details/47090255

http://linux.51yip.com/search/strip

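After strip removes the symbol table from a dynamically linked ELF file, .dynsym is kept and .symtab is discarded, so only the symbols involved in dynamic linking (such as functions imported from libraries) are still visible.

.dynsym is the Dynamic Symbol Table; it holds only the symbols relevant to dynamic linking. .symtab usually holds all symbols, including those in .dynsym.

nm -> reads symbol information (nm -D lists the dynamic symbols)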

ELF relocation

R_386_PC32 type: S + A - P

Offset -> virtual address

address_of_call + offset + 5

5 is the length of the call instruction

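Address -> offset

address - address_of_call - 5

5 is again the length of the call instruction: 1 opcode byte plus the 4-byte immediate operand. Measured from the address of the operand itself (address_of_call + 1), the constant becomes 4, the length of the call instruction's immediate operand.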
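The two conversions can be sketched as a pair of helpers. This is only an illustration: the function names are invented, both helpers take address_of_call as the address of the e8 opcode byte (so the constant is 5 in both directions), and the 32-bit mask is an assumption that mimics i386 address arithmetic.

```python
CALL_LEN = 5          # e8 opcode (1 byte) + rel32 operand (4 bytes)
MASK32 = 0xFFFFFFFF   # assumption: 32-bit wrap-around, as on i386

def offset_to_address(address_of_call: int, offset: int) -> int:
    """Target address of a `call rel32` located at address_of_call."""
    return (address_of_call + offset + CALL_LEN) & MASK32

def address_to_offset(address: int, address_of_call: int) -> int:
    """rel32 operand needed for a call at address_of_call to reach address."""
    return (address - address_of_call - CALL_LEN) & MASK32
```

With the linked example from this post (a call at 0x80480de whose operand is 5), offset_to_address(0x80480de, 5) gives 0x80480e8, the address of foo, and address_to_offset(0x80480e8, 0x80480de) gives 5 back.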

When object file A calls a function in object file B,

for example when objA.o calls foo in objB.o:

e8 fc ff ff ff call 7 <func+0x7>

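Here 0xfffffffc (-4) is the addend A, stored in the operand field before linking. It is the negated size (4 bytes) of the field being relocated, and it compensates for the CPU computing the call target relative to the end of that field (P + 4) rather than its start.

In the executable produced by linking the two: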

80480de: e8 05 00 00 00 call 80480e8 <foo>

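and 5 = S + A - P = 0x80480e8 + 0xfffffffc - 0x80480df (modulo 2^32)

that is, the address of foo in the executable + the addend (-4) - the address of the operand field inside call foo

so the address of foo = the address of the operand field - the addend + 5, where 5 is the offset stored in the instruction

Here:

S is the value of the symbol whose index resides in the relocation entry

A is the addend of the relocation entry (for REL-type relocations on i386 it is stored in the field being patched, here 0xfffffffc)

P is the address of the storage unit being relocated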
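The arithmetic above can be checked with a short sketch. The helper name r_386_pc32 is made up for this example, the values of S, A and P are the ones from the objA.o / objB.o listing in this post, and the result is reduced modulo 2^32 because the relocated field is 32 bits wide.

```python
import struct

MASK32 = 0xFFFFFFFF

def r_386_pc32(S: int, A: int, P: int) -> int:
    """Value written into the relocated field for R_386_PC32: S + A - P."""
    return (S + A - P) & MASK32

S = 0x80480E8   # address of foo in the linked executable
A = 0xFFFFFFFC  # addend read from the operand field of objA.o (-4)
P = 0x80480DF   # address of the operand field (call address 0x80480de + 1)

value = r_386_pc32(S, A, P)
# Re-assemble the patched instruction: opcode e8 + little-endian rel32.
patched = b"\xe8" + struct.pack("<I", value)
```

Here value is 5 and patched.hex() is "e805000000", matching the bytes e8 05 00 00 00 shown for the linked executable.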

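Reviewing ROP along the way (?)

Reference (Shacham, "The Geometry of Innocent Flesh on the Bone: Return-into-libc without Function Calls (on the x86)", CCS 2007):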

http://delivery.acm.org/10.1145/1320000/1315313/p552-shacham.pdf?ip=103.7.29.9&id=1315313&acc=ACTIVE%20SERVICE&key=39FCDE838982416F%2E39FCDE838982416F%2E4D4702B0C3E38B35%2E4D4702B0C3E38B35&acm=1536117299_8bff7915ead70d6f59bb9c1a568f076f

A small part of it has been translated here:

https://firmianay.gitbooks.io/ctf-all-in-one/content/doc/8.1_ret2libc_without_calls.html

~~I still don't know why the crypto section of this ctf-all-in-one is completely empty~~

Noting down the gadget-finding algorithm:


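Scan the binary for ret instructions and use ret as the root node of a tree. From each ret, step backwards and try to decode the bytes immediately before it. If they form a valid instruction, add it as a child node, then check whether it is "boring"; if it is not, keep recursing backwards from that instruction.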

A note on why the paper defines an instruction as "boring", so that the backward recursion stops there, when:

the instruction is a leave instruction and is followed by a ret instruction; or
the instruction is a pop %ebp instruction and is immediately followed by a ret instruction; or
the instruction is a return or an unconditional jump.
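The backward search and the "boring" test can be sketched as follows. This is not the paper's code: the decoder is a toy lookup table that knows only a handful of x86-32 encodings (a hypothetical stand-in for a real disassembler), the trie is a nested dict, and all function names are invented for the example.

```python
# Toy decoder table: a HYPOTHETICAL stand-in for a real x86 disassembler.
TOY_OPCODES = {
    b"\xc3": "ret",
    b"\xc9": "leave",
    b"\x5d": "pop %ebp",
    b"\x58": "pop %eax",
    b"\x5b": "pop %ebx",
    b"\x31\xc0": "xor %eax,%eax",
    b"\x01\xd8": "add %ebx,%eax",
}
MAX_INSN_LEN = 2  # longest encoding the toy decoder knows; real x86 allows 15

def decode(raw: bytes):
    """Return a mnemonic if raw is exactly one known instruction, else None."""
    if len(raw) == 2 and raw[0] == 0xEB:  # jmp rel8 (unconditional)
        return "jmp"
    return TOY_OPCODES.get(raw)

def is_boring(insn: str, parent: str) -> bool:
    """The three criteria: ret / unconditional jmp, or leave / pop %ebp before ret."""
    if insn in ("ret", "jmp"):
        return True
    return insn in ("leave", "pop %ebp") and parent == "ret"

def build_from(code: bytes, pos: int, parent_insn: str, parent_node: dict) -> None:
    """Try every instruction length ending right before pos, then recurse."""
    for step in range(1, MAX_INSN_LEN + 1):
        if pos - step < 0:
            break
        insn = decode(code[pos - step:pos])
        if insn is None:
            continue
        child = parent_node.setdefault(insn, {})  # add as a child node
        if not is_boring(insn, parent_insn):
            build_from(code, pos - step, insn, child)

def find_gadgets(code: bytes) -> dict:
    """Build the trie rooted at ret for every c3 byte in code."""
    root = {}
    for pos, byte in enumerate(code):
        if byte == 0xC3:
            build_from(code, pos, "ret", root)
    return root

def list_sequences(node: dict, suffix=("ret",)) -> list:
    """Flatten the trie into readable 'insn; ...; ret' strings."""
    out = []
    for insn, child in node.items():
        seq = (insn,) + suffix
        out.append("; ".join(seq))
        out.extend(list_sequences(child, seq))
    return out
```

For the bytes 31 c0 5b 01 d8 c3, list_sequences(find_gadgets(...)) yields "add %ebx,%eax; ret", "pop %ebx; add %ebx,%eax; ret" and "xor %eax,%eax; pop %ebx; add %ebx,%eax; ret". For 58 c9 c3 the search stops at leave, so the pop %eax in front of it is never reached. A boring instruction is still recorded as a child node, following the description above; only the recursion past it is cut off.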

The last of these criteria eliminates instruction streams in which control transfers elsewhere before the ret is reached, as these are useless for our purposes.

The other two are intended to capture, and allow us to ignore, instruction streams that are actually generated by the compiler. In other words, the first two criteria match code the compiler emitted on purpose, and that is exactly what we are allowed to skip.

Because the libc we examined was compiled with frame pointer enabled, functions in libc will, by and large, end either with a “leave; ret” sequence or an equivalent where the leave instruction is replaced by mov and pop instructions.

Bonus:

gcc's -fomit-frame-pointer option -> optimizes away the stack frame pointer (SFP)

See:

https://blog.csdn.net/trochiluses/article/details/10495193

It is important to observe that the conditions given here eliminate instruction sequences that would be useful in crafting exploits. There are three ways in which they do so.

Put differently, the "boring" conditions also throw away some instruction streams that would have been useful for writing an exploit, and they do so in three ways.

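First, even if we wish to avoid calling actual functions in libc, suffixes of those functions might prove useful and, if short, difficult for the compiler-writer to eliminate.

That is, even if we avoid calling real libc functions, the tail ends of those functions can still be useful, and when they are short the compiler writer has a hard time getting rid of them.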

Second, the same characteristics that allow us to discover unintended instruction sequences elsewhere will also allow us to discover, within the body of libc functions, unintended sequences that end in intended “leave; ret” sequences.

That is, the property that exposes unintended instruction sequences elsewhere also exposes, inside libc function bodies, unintended sequences that end in an intended “leave; ret”.

………………………………

Bonus:

h = h * 33 + c -> h = ((h << 5) + h) + c
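The rewrite works because 33 = 32 + 1 and h << 5 is h * 32. A quick sketch to check that both forms agree; the function names are invented, and the 32-bit mask is an assumption that mimics a C unsigned int (Python integers do not overflow on their own).

```python
MASK32 = 0xFFFFFFFF  # assumption: emulate 32-bit unsigned wrap-around

def step_mul(h: int, c: int) -> int:
    """One hash step written with a multiplication."""
    return (h * 33 + c) & MASK32

def step_shift(h: int, c: int) -> int:
    """The same step written with a shift and an add: h*32 + h + c."""
    return (((h << 5) + h) + c) & MASK32

def hash_bytes(data: bytes, h: int = 0) -> int:
    """Fold a byte string with the shift form, starting from h."""
    for c in data:
        h = step_shift(h, c)
    return h
```

This multiply-by-33 step is the one used by djb2-style string hashes, which conventionally start from 5381; the default start of 0 here is only a placeholder.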
