[Original] Rapid PE file development with Python and the pefile library
Posted: 2009-5-25 16:26 42639

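Many people here regularly do development work around the PE format, such as parsing PE files and extracting fields from them. Typical uses include building static heuristic virus-detection models, classifying virus samples, and identifying or removing packers.
I searched the forum and found nothing on the tool I want to cover, so I am recommending the Python library pefile here.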

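pefile is an open-source project released under the MIT License, so you are free to build on top of it.
Download:
http://code.google.com/p/pefile/
(Google Code has since shut down; the project is now hosted at https://github.com/erocarrera/pefile.)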

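A few points worth noting:
1. Development is in Python, which makes for fast, agile work, and the source is readable and easy to follow. I would not expect to use it in a commercial product, but for study and research it is very convenient.
2. pefile already parses the PE structure thoroughly, so building on it is easy and the key data structures are readily accessible.
3. Python is quick to write and has a low barrier to entry, and pefile already implements a lot of functionality, so the module suits both people who need results fast and people who are just getting started.
4. It is a free, open-source project.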

Enough talk. Let's go straight to using it; once you have seen it in action you will appreciate how capable pefile is.

1. Install Python.
2. Download pefile, unpack it, and create a new file named petest.py.


Start with Experiment 1.

Experiment 1
import pefile  # remember to import pefile

PEfile_Path = r"C:\temp\test.exe"

pe = pefile.PE(PEfile_Path)  # parse the whole file
print(PEfile_Path)
print(pe)  # printing the PE object dumps every parsed structure

Experiment 1 output
C:\temp\test.exe
----------DOS_HEADER----------

[IMAGE_DOS_HEADER]
e_magic:                       0x5A4D    
e_cblp:                        0x90      
e_cp:                          0x3       
e_crlc:                        0x0       
e_cparhdr:                     0x4       
e_minalloc:                    0x0       
e_maxalloc:                    0xFFFF    
e_ss:                          0x0       
e_sp:                          0xB8      
e_csum:                        0x0       
e_ip:                          0x0       
e_cs:                          0x0       
e_lfarlc:                      0x40      
e_ovno:                        0x0       
e_res:                         
e_oemid:                       0x0       
e_oeminfo:                     0x0       
e_res2:                        
e_lfanew:                      0xD0      

----------NT_HEADERS----------

[IMAGE_NT_HEADERS]
Signature:                     0x4550    

----------FILE_HEADER----------

[IMAGE_FILE_HEADER]
Machine:                       0x14C     
NumberOfSections:              0x2       
TimeDateStamp:                 0x46A8C07C [Thu Jul 26 15:40:44 2007 UTC]
PointerToSymbolTable:          0x0       
NumberOfSymbols:               0x0       
SizeOfOptionalHeader:          0xE0      
Characteristics:               0x10F     
Flags: IMAGE_FILE_LOCAL_SYMS_STRIPPED, IMAGE_FILE_32BIT_MACHINE, IMAGE_FILE_EXECUTABLE_IMAGE, IMAGE_FILE_LINE_NUMS_STRIPPED, IMAGE_FILE_RELOCS_STRIPPED

----------OPTIONAL_HEADER----------

[IMAGE_OPTIONAL_HEADER]
Magic:                         0x10B     
MajorLinkerVersion:            0x6       
MinorLinkerVersion:            0x0       
SizeOfCode:                    0x420     
SizeOfInitializedData:         0x130     
SizeOfUninitializedData:       0x0       
AddressOfEntryPoint:           0x522     
BaseOfCode:                    0x220     
BaseOfData:                    0x640     
ImageBase:                     0x400000  
SectionAlignment:              0x10      
FileAlignment:                 0x10      
MajorOperatingSystemVersion:   0x4       
MinorOperatingSystemVersion:   0x0       
MajorImageVersion:             0x0       
MinorImageVersion:             0x0       
MajorSubsystemVersion:         0x4       
MinorSubsystemVersion:         0x0       
Reserved1:                     0x0       
SizeOfImage:                   0x768     
SizeOfHeaders:                 0x420     
CheckSum:                      0x0       
Subsystem:                     0x2       
DllCharacteristics:            0x0       
SizeOfStackReserve:            0x100000  
SizeOfStackCommit:             0x1000    
SizeOfHeapReserve:             0x100000  
SizeOfHeapCommit:              0x1000    
LoaderFlags:                   0x0       
NumberOfRvaAndSizes:           0x10      
DllCharacteristics: 

----------PE Sections----------

[IMAGE_SECTION_HEADER]
Name:                          .text
Misc:                          0x418     
Misc_PhysicalAddress:          0x418     
Misc_VirtualSize:              0x418     
VirtualAddress:                0x220     
SizeOfRawData:                 0x420     
PointerToRawData:              0x420     
PointerToRelocations:          0x0       
PointerToLinenumbers:          0x0       
NumberOfRelocations:           0x0       
NumberOfLinenumbers:           0x0       
Characteristics:               0x60000020
Flags: IMAGE_SCN_CNT_CODE, IMAGE_SCN_MEM_EXECUTE, IMAGE_SCN_MEM_READ
Entropy: 6.385628 (Min=0.0, Max=8.0)
MD5     hash: 37ae973124ba5655ce156536f4018759
SHA-1   hash: 6354d772105b66ac33fb8950b76a289edafa230f
SHA-256 hash: f6dfe337c6c6278e60a687552d8fc3be2a2ed41a4278713cfd0dc631296befdc
SHA-512 hash: 9d22cdd011d7276f47e3b1844804d58be2e73eef826ad285769d449f03dbfcde743303b31a9172e513be571432b7b2080afe571e5819ec7968acd76c0d82207a

[IMAGE_SECTION_HEADER]
Name:                          .rsrc
Misc:                          0x128     
Misc_PhysicalAddress:          0x128     
Misc_VirtualSize:              0x128     
VirtualAddress:                0x640     
SizeOfRawData:                 0x130     
PointerToRawData:              0x840     
PointerToRelocations:          0x0       
PointerToLinenumbers:          0x0       
NumberOfRelocations:           0x0       
NumberOfLinenumbers:           0x0       
Characteristics:               0x40000040
Flags: IMAGE_SCN_CNT_INITIALIZED_DATA, IMAGE_SCN_MEM_READ
Entropy: 2.905524 (Min=0.0, Max=8.0)
MD5     hash: cfd4f1a98445485c616ea2ff9390278e
SHA-1   hash: 7480ffe5427a540e17353df9c490dbba86fd0c3b
SHA-256 hash: 93f9ad56e464614b6aa9521f2b80f3f7f2fd5e2b6d8d6fd6489a0b1cdb1f948e
SHA-512 hash: b054ba77825a4bb92d9beecb606d04f7a4bf4d16529d909e03e6b882175e23fb495c1c3dc9d921c3124210a6567bf68e70879d3163ece1a1cbb786f3ec94af43

----------Directories----------

[IMAGE_DIRECTORY_ENTRY_EXPORT]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_IMPORT]
VirtualAddress:                0x574     
Size:                          0x3C      
[IMAGE_DIRECTORY_ENTRY_RESOURCE]
VirtualAddress:                0x640     
Size:                          0x128     
[IMAGE_DIRECTORY_ENTRY_EXCEPTION]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_SECURITY]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_BASERELOC]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_DEBUG]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_COPYRIGHT]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_GLOBALPTR]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_TLS]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_LOAD_CONFIG]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_BOUND_IMPORT]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_IAT]
VirtualAddress:                0x220     
Size:                          0x1C      
[IMAGE_DIRECTORY_ENTRY_DELAY_IMPORT]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_COM_DESCRIPTOR]
VirtualAddress:                0x0       
Size:                          0x0       
[IMAGE_DIRECTORY_ENTRY_RESERVED]
VirtualAddress:                0x0       
Size:                          0x0       

----------Imported symbols----------

[IMAGE_IMPORT_DESCRIPTOR]
OriginalFirstThunk:            0x5B0     
Characteristics:               0x5B0     
TimeDateStamp:                 0x0        [Thu Jan 01 00:00:00 1970 UTC]
ForwarderChain:                0x0       
Name:                          0x5E0     
FirstThunk:                    0x220     

KERNEL32.dll.GetModuleHandleA Hint[294]

[IMAGE_IMPORT_DESCRIPTOR]
OriginalFirstThunk:            0x5B8     
Characteristics:               0x5B8     
TimeDateStamp:                 0x0        [Thu Jan 01 00:00:00 1970 UTC]
ForwarderChain:                0x0       
Name:                          0x62C     
FirstThunk:                    0x228     

USER32.dll.EndDialog Hint[185]
USER32.dll.GetDlgItemTextA Hint[260]
USER32.dll.DialogBoxParamA Hint[147]
USER32.dll.MessageBoxA Hint[446]

----------Resource directory----------

[IMAGE_RESOURCE_DIRECTORY]
Characteristics:               0x0       
TimeDateStamp:                 0x0        [Thu Jan 01 00:00:00 1970 UTC]
MajorVersion:                  0x0       
MinorVersion:                  0x0       
NumberOfNamedEntries:          0x0       
NumberOfIdEntries:             0x1       
  Id: [0x5] (RT_DIALOG)
  [IMAGE_RESOURCE_DIRECTORY_ENTRY]
  Name:                          0x5       
  OffsetToData:                  0x80000018
    [IMAGE_RESOURCE_DIRECTORY]
    Characteristics:               0x0       
    TimeDateStamp:                 0x0        [Thu Jan 01 00:00:00 1970 UTC]
    MajorVersion:                  0x0       
    MinorVersion:                  0x0       
    NumberOfNamedEntries:          0x0       
    NumberOfIdEntries:             0x1       
      Id: [0x65]
      [IMAGE_RESOURCE_DIRECTORY_ENTRY]
      Name:                          0x65      
      OffsetToData:                  0x80000030
        [IMAGE_RESOURCE_DIRECTORY]
        Characteristics:               0x0       
        TimeDateStamp:                 0x0        [Thu Jan 01 00:00:00 1970 UTC]
        MajorVersion:                  0x0       
        MinorVersion:                  0x0       
        NumberOfNamedEntries:          0x0       
        NumberOfIdEntries:             0x1       
          [IMAGE_RESOURCE_DIRECTORY_ENTRY]
          Name:                          0x804     
          OffsetToData:                  0x48      
            [IMAGE_RESOURCE_DATA_ENTRY]
            OffsetToData:                  0x6A0     
            Size:                          0xC8      
            CodePage:                      0x0       
            Reserved:                      0x0       

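Experiment 1 is nothing more than a print, yet it shows that pefile parses test.exe completely, from the DOS header through the optional header to the PE sections, and every structure is fully available. Attentive readers will also notice that it hashes the data of each section (MD5, SHA-1, SHA-256, SHA-512), reports each section's entropy, and lists the imported functions (and exported ones, when the file has any).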
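The per-section hash and entropy lines in the dump can be reproduced by hand, which helps when reading them. The sketch below is only an illustration, not pefile's own code: `describe_section_data` and `shannon_entropy` are hypothetical helpers that take raw bytes, and the sample input is made up. With pefile, the input would be something like `pe.sections[0].get_data()`.

```python
import hashlib
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy in bits per byte: 0.0 for constant data, at most 8.0."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # iterating bytes yields ints on Python 3
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def describe_section_data(data):
    """Hypothetical helper: hashes and entropy for one section's raw bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha1": hashlib.sha1(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
        "sha512": hashlib.sha512(data).hexdigest(),
        "entropy": shannon_entropy(data),
    }


# Stand-in bytes, not taken from test.exe.
sample = b"\x00" * 64 + bytes(range(256))
for key, value in sorted(describe_section_data(sample).items()):
    print(key, value)
```

An entropy close to 8.0 means the bytes look random, which is why the value is often used as a hint that a section is compressed or encrypted.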
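Of course you will ask: surely we are not expected to print everything and then parse and match strings to pull out what we need? pefile is not that crude; it exposes the parsed structures as attributes. For example, to get the entry point:
print(hex(pe.OPTIONAL_HEADER.AddressOfEntryPoint))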
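AddressOfEntryPoint is a relative virtual address (RVA), so it usually has to be converted before it can be used. As a worked example with the values from the Experiment 1 dump, the sketch below turns it into a virtual address and a file offset. `rva_to_offset` is a hypothetical helper written for illustration, and it assumes the raw pointers need no alignment adjustment; pefile offers `pe.get_offset_from_rva()` for the same job.

```python
# (name, VirtualAddress, Misc_VirtualSize, PointerToRawData, SizeOfRawData),
# copied from the Experiment 1 dump.
SECTIONS = [
    (".text", 0x220, 0x418, 0x420, 0x420),
    (".rsrc", 0x640, 0x128, 0x840, 0x130),
]
IMAGE_BASE = 0x400000
ENTRY_POINT_RVA = 0x522


def rva_to_offset(rva, sections):
    """Map an RVA to (section name, file offset) via the section containing it."""
    for name, vaddr, vsize, raw_ptr, raw_size in sections:
        span = max(vsize, raw_size)
        if vaddr <= rva < vaddr + span:
            return name, rva - vaddr + raw_ptr
    raise ValueError("RVA 0x%X is not inside any section" % rva)


section_name, file_offset = rva_to_offset(ENTRY_POINT_RVA, SECTIONS)
print("entry point VA: 0x%X" % (IMAGE_BASE + ENTRY_POINT_RVA))
print("entry point lies in %s at file offset 0x%X" % (section_name, file_offset))
```

For this file the entry point falls in .text: 0x522 - 0x220 + 0x420 = 0x722.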


Experiment 2
Experiment 2: section table
import pefile  # remember to import pefile

PEfile_Path = r"C:\temp\test.exe"

pe = pefile.PE(PEfile_Path)
print(PEfile_Path)

# pe.sections is a list with one entry per section header
for section in pe.sections:
    print(section)


Experiment 2 output
C:\temp\test.exe
[IMAGE_SECTION_HEADER]
Name:                          .text
Misc:                          0x418     
Misc_PhysicalAddress:          0x418     
Misc_VirtualSize:              0x418     
VirtualAddress:                0x220     
SizeOfRawData:                 0x420     
PointerToRawData:              0x420     
PointerToRelocations:          0x0       
PointerToLinenumbers:          0x0       
NumberOfRelocations:           0x0       
NumberOfLinenumbers:           0x0       
Characteristics:               0x60000020
[IMAGE_SECTION_HEADER]
Name:                          .rsrc
Misc:                          0x128     
Misc_PhysicalAddress:          0x128     
Misc_VirtualSize:              0x128     
VirtualAddress:                0x640     
SizeOfRawData:                 0x130     
PointerToRawData:              0x840     
PointerToRelocations:          0x0       
PointerToLinenumbers:          0x0       
NumberOfRelocations:           0x0       
NumberOfLinenumbers:           0x0       
Characteristics:               0x40000040

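The file has two sections, .text and .rsrc, and the output lists the header fields of each. To read one specific field of a given section, such as Characteristics, use:
print(hex(pe.sections[i].Characteristics))  # i is the zero-based section index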
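The Characteristics value is a bit mask, and the `Flags:` lines in the Experiment 1 dump are its decoded form. A minimal sketch of that decoding follows, using a handful of section flag values from the PE specification. `decode_section_flags` is a hypothetical helper, and the table is deliberately incomplete.

```python
# A subset of the IMAGE_SCN_* section flags; not the full list.
SECTION_FLAGS = [
    (0x00000020, "IMAGE_SCN_CNT_CODE"),
    (0x00000040, "IMAGE_SCN_CNT_INITIALIZED_DATA"),
    (0x00000080, "IMAGE_SCN_CNT_UNINITIALIZED_DATA"),
    (0x02000000, "IMAGE_SCN_MEM_DISCARDABLE"),
    (0x10000000, "IMAGE_SCN_MEM_SHARED"),
    (0x20000000, "IMAGE_SCN_MEM_EXECUTE"),
    (0x40000000, "IMAGE_SCN_MEM_READ"),
    (0x80000000, "IMAGE_SCN_MEM_WRITE"),
]


def decode_section_flags(characteristics):
    """Return the names of the known flags set in a Characteristics value."""
    return [name for mask, name in SECTION_FLAGS if characteristics & mask]


# The two Characteristics values from the dump: .text and .rsrc.
for value in (0x60000020, 0x40000040):
    print(hex(value), ", ".join(decode_section_flags(value)))
```

The results match the dump: .text is code that is readable and executable, and .rsrc is initialized data that is readable.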


Experiment 3
Experiment 3: import table
import pefile  # remember to import pefile

PEfile_Path = r"C:\temp\test.exe"

pe = pefile.PE(PEfile_Path)
print(PEfile_Path)

# On Python 3 the names below are bytes; call .decode() to get text.
for importeddll in pe.DIRECTORY_ENTRY_IMPORT:
    print(importeddll.dll)
    # or index directly:
    # print(pe.DIRECTORY_ENTRY_IMPORT[0].dll)
    for importedapi in importeddll.imports:
        print(importedapi.name)
    # or index directly:
    # print(pe.DIRECTORY_ENTRY_IMPORT[0].imports[0].name)


Experiment 3 output
C:\temp\test.exe
KERNEL32.dll
GetModuleHandleA
USER32.dll
EndDialog
GetDlgItemTextA
DialogBoxParamA
MessageBoxA

Experiment 3 shows that test.exe imports from kernel32.dll and user32.dll, taking 1 and 4 API functions from them respectively.

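By now you should have a feel for how pefile is used and how capable it is. It has many other features, such as modifying PE structures, and it can load a PEiD signature database to identify packers. Give it a try.

I hope pefile's capabilities, combined with how easy Python is to use, will be of help to you.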

Latest replies (21)
2
Claiming this floor; hoping the moderator will toss me some cash or something.
2009-5-25 16:46
3
Support.
2009-5-25 16:48
4
Python drives me crazy. There is plenty of open-source code for it, but the versions change too XXX and backward compatibility is too XXX.
2009-5-25 22:53
5
Open source is the way forward.
2009-5-25 23:02
6
Bookmarked; I'll come back once I can understand it.
2009-5-26 08:06
7
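A small correction for the original poster: pefile is not an open-source project from MIT; rather, its code is licensed under the MIT License. It was written by Ero Carrera.
pefile is a great tool.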
2009-7-19 17:18
8
Sitting down to study this…
2009-7-19 20:13
9
There is still not much Python-based material on the forum.
2009-7-21 11:38
10
Following this.
2009-7-21 23:14
11
Programs written in Python really are simple and easy to follow.
2009-8-24 15:34
12
Good stuff! Is there a similar Python library for ELF files?
2009-8-24 22:26
13
Thanks, this is quite good.
2009-9-1 15:09
14
Very nice.
2010-9-25 10:44
15
I have wanted to dig into this but have not had the time; this will come in handy.
2011-4-3 09:10
16
Thanks to the original poster for sharing.
2011-6-16 13:48
17
Showing my support...
2011-6-16 14:40
18
This is really powerful.
I'll grab it and learn from it.
2011-9-23 14:34
19
    mype = pefile.PE(pefilepath)
    # print pe # parse_version_information('MajorLinkerVersion')
    if hasattr(mype, 'VS_VERSIONINFO'):
      if hasattr(mype, 'FileInfo'):
        for entry in mype.FileInfo:
          if hasattr(entry, 'StringTable'):
            for st in entry.StringTable:
              for k, v in st.entries.items():
                  print k
                  print v
2015-12-14 23:54
20
nice work!!
2015-12-24 17:20
21
Powerful.
2017-5-14 09:29
22
if hasattr(mype, 'VS_VERSIONINFO'):
  if hasattr(mype, 'FileInfo'):
    for entry in mype.FileInfo:
      if hasattr(entry, 'StringTable'):
        for st in entry.StringTable:
          for k, v in st.entries.items():
              print k
              print v
This has a small bug. It should be:
if hasattr(peInfo, 'VS_VERSIONINFO'):
    if hasattr(peInfo, 'FileInfo'):
        print('count %d' % len(peInfo.FileInfo))
        # FileInfo is a list of lists here, so iterate one level deeper
        for entry in peInfo.FileInfo:
            for list1 in entry:
                if hasattr(list1, 'StringTable'):
                    for st in list1.StringTable:
                        for k, v in st.entries.items():
                            print(k.decode('utf-8'), ':', v.decode('utf-8'))
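The difference between the two snippets is the shape of `FileInfo`: the first treats it as a flat list of entries, the corrected one as a list of lists. A hedged sketch that accepts either shape follows. `version_strings` is a hypothetical helper, and the demo uses stand-in objects with made-up values rather than a real PE file.

```python
from types import SimpleNamespace


def _text(value):
    """Decode bytes for display; pass text through unchanged."""
    return value.decode("utf-8", "replace") if isinstance(value, bytes) else value


def version_strings(pe):
    """Collect StringTable key/value pairs from a parsed PE's version resource."""
    found = {}
    for item in getattr(pe, "FileInfo", []):
        # Flat layout: item is an entry. Nested layout: item is a list of entries.
        entries = item if isinstance(item, (list, tuple)) else [item]
        for entry in entries:
            # Entries without a StringTable are skipped.
            for table in getattr(entry, "StringTable", []):
                for key, value in table.entries.items():
                    found[_text(key)] = _text(value)
    return found


# Stand-in data (hypothetical values) in both layouts.
table = SimpleNamespace(entries={b"ProductName": b"Demo", b"FileVersion": b"1.0"})
entry = SimpleNamespace(StringTable=[table])
flat = SimpleNamespace(FileInfo=[entry])
nested = SimpleNamespace(FileInfo=[[entry]])

print(version_strings(flat))
print(version_strings(nested))
```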
2020-2-12 13:21