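A stack buffer overflow (or stack buffer overrun) occurs when a program writes to memory on the call stack outside the bounds of the intended data structure.[1][2] It is one kind of buffer overflow.[1] The overflow corrupts adjacent data, which can crash the program or overwrite a function's return address and so cause malicious code to run. This kind of attack is called stack smashing. It can be used to inject executable code and take over the execution of a process, and it is one of the oldest attack techniques used by hackers.[3][4][5]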

Example

In the following example, the function's return address can be overwritten[3][6] through strcpy():

#include <string.h>

void foo (char *bar)
{
   char  c[12];

   strcpy(c, bar);  // no bounds checking
}

int main (int argc, char **argv)
{
   foo(argv[1]);

   return 0;
}

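When the command-line argument is shorter than 12 characters (case B below), the program is safe.

The call stack of foo() under different inputs is shown below, as the memory state of a 32-bit little-endian system when the stack buffer overflow occurs: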

Figure A: Before data is copied.
Figure B: "hello" is the first command-line argument.
Figure C: "AAAAAAAAAAAAAAAAAAAA\x08\x35\xC0\x80" is the first command-line argument.

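In case C, the command-line argument is longer than 11 characters, so foo() overwrites local stack data, the saved frame pointer (EBP), and, most importantly, the return address.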
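For contrast, a bounds-checked copy might look like the sketch below. The name foo_checked and its return-code convention are hypothetical and are not part of the original example. The idea is simply to compare the input length against the buffer size before copying, and to refuse input that does not fit.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical bounds-checked variant of foo(): copies bar into a 12-byte
   buffer only if it fits, including the terminating NUL.
   Returns 0 on success, -1 if bar is NULL or too long. */
int foo_checked(const char *bar)
{
    char c[12];

    if (bar == NULL)
        return -1;              /* e.g. no command-line argument was given */

    size_t len = strlen(bar);
    if (len >= sizeof c)
        return -1;              /* would not fit: refuse instead of overflowing */

    memcpy(c, bar, len + 1);    /* len + 1 <= sizeof c, so the copy stays in bounds */
    return c[len] == '\0' ? 0 : -1;
}
```

With this check, an 11-character argument is accepted and a 12-character argument is rejected, which matches the limit described above.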

An attack can also modify the value of a local variable:

#include <string.h>
#include <stdio.h>

void foo (char *bar)
{
   float My_Float = 10.5; // Addr = 0x0023FF4C
   char  c[28];           // Addr = 0x0023FF30

   // Will print 10.500000
   printf("My Float value = %f\n", My_Float);

    /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       Memory map:
       @ : c allocated memory
       # : My_Float allocated memory

           *c                      *My_Float
       0x0023FF30                  0x0023FF4C
           |                           |
           @@@@@@@@@@@@@@@@@@@@@@@@@@@@#####
      foo("my string is too long !!!!! XXXXX");

   memcpy will store the bytes 10 10 C0 42 in My_Float (0x42C01010 when read as little-endian).
   ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~*/

   memcpy(c, bar, strlen(bar));  // no bounds checking...

   // Will print 96.031372
   printf("My Float value = %f\n", My_Float);
}

int main (int argc, char **argv)
{
   foo("my string is too long !!!!! \x10\x10\xc0\x42");
   return 0;
}
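To see why the four trailing bytes in the second example turn 10.5 into roughly 96.03, the bit pattern can be decoded directly. The sketch below assumes that float is a 32-bit IEEE 754 single-precision value, which is true on common platforms but is not guaranteed by the C standard. The helper name float_from_bits is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 32-bit pattern as a float (assumed 32-bit IEEE 754).
   memcpy is used instead of a pointer cast to avoid aliasing problems. */
float float_from_bits(uint32_t bits)
{
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Under that assumption, float_from_bits(0x41280000u) gives 10.5, the original value of My_Float, and float_from_bits(0x42C01010u) gives approximately 96.031372, the value printed after the overflow. On a little-endian machine the bytes 10 10 C0 42 in memory form exactly the word 0x42C01010.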

Platform-related differences


A number of platforms have subtle differences in their implementation of the call stack that can affect the way a stack buffer overflow exploit will work. Some machine architectures store the top level return address of the call stack in a register. This means that any overwritten return address will not be used until a later unwinding of the call stack. Another example of a machine specific detail that can affect the choice of exploitation techniques is the fact that most RISC style machine architectures will not allow unaligned access to memory.[7] Combined with a fixed length for machine opcodes this machine limitation can make the jump to ESP technique almost impossible to implement (with the one exception being when the program actually contains the unlikely code to explicitly jump to the stack register).[8][9]

Stacks that grow up

Within the topic of stack buffer overflows, an often-discussed but rarely seen architecture is one in which the stack grows in the opposite direction. This change is frequently suggested as a solution to the stack buffer overflow problem, because an overflow of a stack buffer within the same stack frame cannot overwrite the return pointer. On closer inspection, this protection turns out to be a naive solution at best. Any overflow that occurs in a buffer from a previous stack frame will still overwrite a return pointer and allow malicious exploitation of the bug.[10] For instance, in the strcpy example above, the return pointer for foo will not be overwritten, because the overflow actually occurs within the stack frame for strcpy. However, because the buffer that overflows during the call to strcpy resides in a previous stack frame, the return pointer for strcpy has a numerically higher memory address than the buffer. So instead of the return pointer for foo, the return pointer for strcpy is overwritten. At most, growing the stack in the opposite direction changes some details of how stack buffer overflows are exploited, but it does not significantly reduce the number of exploitable bugs.

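Protection schemes

Three approaches are commonly used to counter stack buffer overflow attacks.

Detecting stack buffer overflows

Stack canaries, named after the canary in a coal mine, are used to detect that an overflow has occurred. An integer value, chosen at random when the program is loaded, is placed on the stack just before the stored return address. A stack buffer overflow overwrites the stack from lower to higher addresses, so it overwrites the canary before it overwrites the return address. Before the function returns, the canary is checked to see whether it has been altered.[2]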
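The check can be illustrated with a small simulation. The sketch below does not touch the real call stack. It models a frame as a struct (demo_frame, a hypothetical name) with the buffer at the lowest address, followed by a canary and a placeholder return address. The canary is a fixed constant here so that the behavior is repeatable, whereas a real canary is random. Compilers insert the real check automatically, so this only shows the principle.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CANARY 0xDEADC0DEu   /* fixed for illustration; real canaries are random */

/* Simulated stack frame: buffer first, then the canary, then the return address. */
struct demo_frame {
    char     buf[12];
    uint32_t canary;
    uint32_t ret;
};

/* Copies n bytes from src to the start of a simulated frame without checking
   n against sizeof buf (the bug), then checks the canary as a function
   epilogue would. src must point to at least n readable bytes.
   Returns 1 if the canary is intact, 0 if it was overwritten. */
int copy_and_check(const void *src, size_t n)
{
    struct demo_frame frame;

    memset(frame.buf, 0, sizeof frame.buf);
    frame.canary = DEMO_CANARY;
    frame.ret    = 0x08048000u;    /* placeholder, never used as a real address */

    if (n > sizeof frame)          /* keep the simulation itself within bounds */
        n = sizeof frame;

    memcpy(&frame, src, n);        /* bytes past buf spill into canary, then ret */

    return frame.canary == DEMO_CANARY;
}
```

Copying 12 bytes or fewer leaves the canary untouched and the function returns 1. Copying 13 bytes or more changes at least one byte of the canary, so the function returns 0 before the simulated return address would be used.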

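Nonexecutable stack

This approach enforces a "write XOR execute" policy (W^X): memory may be writable or executable, but not both. It is the most commonly used method, and most desktop processors provide hardware support for a no-execute flag.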


While this method definitely makes the canonical approach to stack buffer overflow exploitation fail, it is not without its problems. First, it is common to find ways to store shellcode in unprotected memory regions like the heap, and so very little need change in the way of exploitation.[11]

Even if this were not so, there are other ways. The most damning is the so-called return to libc method for shellcode creation. In this attack the malicious payload will load the stack not with shellcode, but with a proper call stack so that execution is vectored to a chain of standard library calls, usually with the effect of disabling memory execute protections and allowing shellcode to run as normal.[12] This works because the execution never actually vectors to the stack itself.

A variant of return-to-libc is return-oriented programming, which sets up a series of return addresses, each of which executes a small sequence of cherry-picked machine instructions within the existing program code or system libraries, with each sequence ending in a return. These so-called gadgets each accomplish some simple register manipulation or similar operation before returning, and stringing them together achieves the attacker's ends. It is even possible to use "returnless" return-oriented programming by exploiting instructions or groups of instructions that behave much like a return instruction.[13]

Address space layout randomization


Instead of separating code from data, another mitigation is to randomize the memory layout of the executing program. The attacker needs to know where usable executable code resides: either an executable payload is supplied (which requires an executable stack), or one is constructed through code reuse, as in ret2libc or return-oriented programming (ROP). Randomizing the memory layout is meant to prevent the attacker from knowing where any code is. Implementations, however, typically do not randomize everything. The executable itself is usually loaded at a fixed address, so even when address space layout randomization (ASLR) is combined with a nonexecutable stack, the attacker can use this fixed region of memory. Programs should therefore be compiled as position-independent executables (PIE) so that this region is randomized as well. The entropy of the randomization differs between implementations, and sufficiently low entropy is itself a problem because the randomized address space can be brute-forced.

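ASLR randomizes the addresses of a process's memory segments, which hinders return-oriented programming and jump-to-shellcode attacks, because the attacker needs to know the addresses of the executable segments and of the shellcode. If an executable is not built as a position-independent executable (PIE), some of its segments (such as .text) are loaded at fixed addresses, and on Linux an attacker can still return to code at address 0x400000. The effectiveness of ASLR depends on its entropy, which varies between implementations. Lower entropy makes addresses easier to guess, so 32-bit systems are more exposed than 64-bit ones. In the Stagefright vulnerability of 2015, for example, ASLR on 32-bit systems provided only 8 bits of entropy, so each exploitation attempt had to guess among only 256 possible memory addresses.[14]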
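The effect of low entropy can be made concrete with a little arithmetic. The sketch below rests on a simplified model that is an assumption, not something taken from the cited source: every attempt faces a freshly and uniformly randomized layout, and attempts are independent. The function names are hypothetical.

```c
#include <stdint.h>

/* Number of equally likely layouts for a given number of entropy bits (bits < 64). */
uint64_t aslr_candidates(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Probability that at least one of `attempts` independent guesses is correct,
   when each attempt faces one of 2^bits layouts chosen uniformly at random. */
double guess_success_probability(unsigned bits, unsigned attempts)
{
    double miss = 1.0 - 1.0 / (double)aslr_candidates(bits);
    double all_miss = 1.0;

    for (unsigned i = 0; i < attempts; i++)
        all_miss *= miss;          /* every attempt so far has missed */

    return 1.0 - all_miss;
}
```

Under this model, aslr_candidates(8) is 256, and 256 attempts against 8 bits of entropy succeed with a probability of about 63%. With the larger entropy available on 64-bit systems, the same number of attempts is far less likely to succeed.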

Notable examples

See also

References
