:[diStorm64}:
The ultimate disassembler library
(for AMD64, X86-64)
diStorm3's source code is available on Google Code.
Description:
diStorm64 is a professional quality open source disassembler library for AMD64, licensed under the BSD license.
diStorm is a binary stream disassembler. It can disassemble 80x86 instructions in 64 bits (AMD64, X86-64) as well as in 16 and 32 bits.
In addition, it disassembles FPU, MMX, SSE, SSE2, SSE3, SSSE3, SSE4, 3DNow! (w/ extensions), new x86-64 instruction sets, VMX, and AMD's SVM!
diStorm was written to decode every instruction quickly and as accurately as possible.
Robust decoding, while taking special care for valid or unused prefixes, is what makes this disassembler powerful, especially for research.
Another benefit that might come in handy is that the module was written with multi-threading in mind, which means you can disassemble several streams simultaneously.
For rapid use, diStorm is compiled for Python and is easily used in C as well.
diStorm was originally written under Windows and later ported to Linux and Mac. The source code is portable and platform independent (supports both little and big endianness).
It also can be used as a ring0 disassembler (tested as a kernel driver using the DDK under Windows)!
Note that there are currently no known bugs.
Please visit this page periodically in order to get an updated version if available.
The output consists of a few fields:
1)Offset of the disassembled instruction.
2)Size of the disassembled instruction.
3)Hex dump of the disassembled instruction in little-endian format
(split according to the instruction's elements).
4)Textual representation of the disassembled instruction in Intel format.
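The four fields above can be rendered as one listing line per instruction. The following is a minimal sketch: the function name and the column widths are assumptions made for illustration, and the sample values are hand-written rather than captured diStorm output.

```python
def format_instruction(offset, size, hexdump, text):
    """Render one decoded instruction as a single listing line.

    The four arguments mirror the four output fields listed above; the
    column widths are an arbitrary choice for this sketch.
    """
    return "0x%08x (%02d) %-24s %s" % (offset, size, hexdump, text)


# Hand-written sample values (not captured diStorm output).
print(format_instruction(0x100, 2, "b409", "MOV AH, 0x9"))
```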
More details about the decoding phase:
- Unused/extra prefixes are dropped (output as DB'ed).
- Lock prefix works only on lockable instructions if the first operand is in the form of memory indirection.
- REPn/z prefix works only on repeatable string instructions as well as I/O instructions.
- Segment Override prefixes are possible where memory indirection address is being used (and specially treated with string and I/O instructions).
- Some SSE2 instructions support pseudo opcodes (CMP family).
- Waitable instructions are supported (FINIT etc.).
- "Native" instructions are those which keep the same mnemonic in different decoding modes; when there's an operand size prefix, a suffix letter is appended to the mnemonic in order to indicate the operation size (instructions like: PUSHA, IRET, RETF, etc.).
- XLAT instruction is treated specially when prefixed.
- Some instructions which have two mnemonics according to the decoding modes are supported.
- Truncates instructions when the end of the stream is reached.
- Drops invalid instructions when their operands are invalid.
- Won't decode instructions which are longer than 15 bytes.
- CR8 register is now accessible using the Lock prefix in 32 bits decoding mode.
- In 64 bits decoding mode the Segment Override prefixes CS, DS, ES and SS are ignored.
- Segment pushes can be prefixed by operand size prefix.
- Instructions such as: JMP FAR, CALL FAR, CMPXCHG8/16B, SIDT, LIDT, SGDT and others with complex memory indirection types are fully supported with size indication.
- In 64 bits decoding mode ARPL is actually MOVZXD, and when it's prefixed with REX it becomes MOVSXD.
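Two of the rules above, the 15-byte instruction limit and the ignored segment overrides in 64 bits decoding mode, can be sketched in a few lines. This is not diStorm's code: the function and constant names are hypothetical, and only the prefix byte values and the 15-byte limit come from the x86 architecture itself.

```python
# Segment override prefix bytes defined by the x86 architecture.
SEGMENT_PREFIXES = {
    0x26: "ES", 0x2E: "CS", 0x36: "SS",
    0x3E: "DS", 0x64: "FS", 0x65: "GS",
}

# Architectural upper bound on the length of a single instruction.
MAX_INSTRUCTION_LENGTH = 15


def effective_segment_override(prefix_byte, mode_bits):
    """Return the segment a prefix selects, or None if it has no effect.

    mode_bits is 16, 32 or 64. In 64-bit mode the CS, DS, ES and SS
    overrides are ignored, so only FS and GS are reported.
    """
    segment = SEGMENT_PREFIXES.get(prefix_byte)
    if segment is None:
        return None  # Not a segment override prefix at all.
    if mode_bits == 64 and segment in ("CS", "DS", "ES", "SS"):
        return None
    return segment


def is_decodable_length(length, remaining):
    """Check a candidate instruction length against the 15-byte limit
    and against the number of bytes left in the stream."""
    return 0 < length <= MAX_INSTRUCTION_LENGTH and length <= remaining
```

A decoder built this way would treat a rejected prefix as unused and a rejected length as an invalid or truncated instruction, in line with the list above.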
Future Updates:
Code analysis related material.
SSE5.
diStorm Mailing List:
If you wish to get up to date with diStorm releases this is the place.
The list will be used to discuss matters of disassemblers and other tools in the same domain.
Programmers and users should use this list to share information regarding diStorm and its development.
This is in the hope that a growing community will develop code analysis tools and will get better and bigger.
The mailing list is open to the public (so are the archives), the list of members is private, and anyone can subscribe.
If you wish to subscribe to the mailing list, subscribe here.
Downloads (Windows):
Version: diStorm64 1.7.30
- distorm64.zip - C Library, VS7 Sample, [MD5:1b0ac187e4053be4bb95b50e53fc8998] | [2008-09-21]
- distorm.pyd - Python 2.5, [MD5:d556a8777a7558c2671a54fb454d6752] | [2008-09-21]
- distorm.pyd - Python 2.4, [MD5:d0c4c3db41fb4e7ba6204bc370143e98] | [2008-09-21]
- distorm.pyd - Python 2.3, [MD5:f596eb94e6c6b72c1e83aa303ac626f2] | [2008-09-21]
- disasm32-1.7.30.exe - Flat Disassembler Executable (for x86), [MD5:d502976b8800cc809ce9433ed47f1852] | [2008-09-21]
Linux Port:
"diStorm compiles under GCC like a charm!".
Since diStorm is mostly source code compatible with Linux, download the package and compile it there.
Please follow the readme files in order to use diStorm properly.
diStorm Package:
The diStorm package contains:
- diStorm64's full source code!
- Win32 solution files for VS03.
- Linux make files.
- Windows disassembler project.
- Windows kernel driver disassembler project.
- Linux disassembler project.
- Project Documentation.
- Various Compilers Support.
Download the package now:
Documentation:
diStorm's documentation is separated into two major parts.
The first part talks about 80x86 low level, and the second part talks about the internals of diStorm.
Both are meant to explain how diStorm was written, decisions I had to make, how it's implemented,
how to compile diStorm and much more invaluable information about diStorm and 80x86 bits.
The first part was written partly as a tutorial so it would benefit all of you.
Click here. [Last updated on: 2006-10-02]
This is the first release of the documentation and it's subject to change.
Interface Documentation:
Python:
list Decode(long offset, string code, int mode)
Input:
    offset - Offset of the code in memory (as of origin, virtual address!)
    code   - Buffer of the binary code
    mode   - Decode16Bits - 80286 decoding
             Decode32Bits - IA-32 decoding
             Decode64Bits - AMD64 decoding
Return:
    list   - List of tuples with the disassembled instructions;
             each tuple consists of offset, size, mnemonic and hex strings per instruction
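Because each tuple carries both an offset and a size, the returned list can be walked to confirm that the instructions follow one another without gaps. The sketch below assumes the tuple layout documented above; the helper name is hypothetical and the sample list is hand-written, not real Decode output.

```python
def check_contiguous(instructions, start_offset):
    """Verify that each instruction starts where the previous one ended.

    `instructions` is a list of (offset, size, mnemonic, hex) tuples in the
    layout documented above. Returns the offset just past the last one.
    """
    expected = start_offset
    for offset, size, mnemonic, hexdump in instructions:
        if offset != expected:
            raise ValueError("gap or overlap at 0x%x" % offset)
        expected = offset + size
    return expected


# Hand-written sample in the documented tuple layout (not real diStorm output).
sample = [
    (0x100, 2, "MOV AH, 0x9", "b409"),
    (0x102, 3, "MOV DX, 0x10b", "ba0b01"),
    (0x105, 2, "INT 0x21", "cd21"),
    (0x107, 1, "RET", "c3"),
]
end = check_contiguous(sample, 0x100)
print("decoded %d bytes" % (end - 0x100))
```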
C Library:
Please download the library sample for a usage example; you can also read distorm.h for more info.
I apologize for any inconvenience caused by removing the deprecated linked list interface; the new interface is what it should have been from the beginning.
In brief, you supply an array and the disassembler will fill it with the disassembled instructions.
This way, it spares the extra copying from the linked list to your data structures.
Python Example:
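Note: Save the distorm.pyd file in %PYTHONDIR%\Lib\site-packages\
from distorm import Decode, Decode16Bits, Decode32Bits, Decode64Bits

# Read the raw 16-bit COM image; COM files are loaded at offset 0x100.
code = open("file.com", "rb").read()
for offset, size, mnemonic, hexdump in Decode(0x100, code, Decode16Bits):
    print("0x%08x (%02x) %-20s %s" % (offset, size, hexdump, mnemonic))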
diSlib64 (A Python PE Parser):
diSlib is an easy-to-use Python module to parse PE executables.
It will give you all necessary information such as:
- AMD64 PE+ files are parsed too [NEW]
- sections with their accompanying information
- imported functions and their addresses (IAT)
- exported functions by name, ordinal and address
- supports ImageBase relocation
- relocated entries by offsets and their original Q/DWORD values.
- lets you apply the relocations
- uses exceptions and OO interface (thanks to Shenberg!)
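To illustrate what telling PE from AMD64 PE+ files involves, here is a minimal sketch that is independent of diSlib and does not use its interface. The function name is hypothetical; the offsets and magic values are those of the PE file format.

```python
import struct


def pe_optional_header_kind(data):
    """Classify a PE image as "PE32" or "PE32+" from its raw bytes.

    Raises ValueError if the DOS or PE signatures are missing.
    """
    if data[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    # e_lfanew: file offset of the PE signature, stored at offset 0x3C.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    # The optional header follows the 4-byte signature and 20-byte COFF header.
    (magic,) = struct.unpack_from("<H", data, pe_offset + 4 + 20)
    if magic == 0x10B:
        return "PE32"
    if magic == 0x20B:
        return "PE32+"
    raise ValueError("unknown optional header magic 0x%x" % magic)
```

It could be called as pe_optional_header_kind(open("file.exe", "rb").read()), where file.exe is a placeholder name.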
Download: diSlib64.py
Last Update: 12th Feb 2008
Note: diSlib64 uses diStorm to disassemble the entry point routine; you can comment this out in the main() function.
Sample output of my disassembler using diSlib64: Output #0
diSlib is part of a bigger project which will eventually turn out to be a high level disassembler (in contrast to diStorm) with code analysis and more.
disOps (Instructions Sets DataBase):
It all began when I designed diStorm and decided that the instructions data base should be separated from the code which decodes the instructions. I therefore needed a tool to create the instructions tables for the disassembler engine to use, so I wrote an internal tool in C, formerly known as IGEN.
disOps is the instructions tables generator for diStorm.
It is a standalone helper tool, rewritten in Python, which contains all the instructions that diStorm supports in an easily accessible data base.
It uses its instructions DB for generating the insts.c file for diStorm,
but it can be used for other things as well.
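The idea of generating C tables from an instructions data base can be sketched as follows. This is not disOps' actual format: the record layout, the C struct and the table name are all hypothetical, and the three records are illustrative one-byte opcodes.

```python
# Hypothetical instruction records: (opcode byte, mnemonic).
# The real disOps data base is far richer; these entries are illustrative.
INSTRUCTIONS = [
    (0xC3, "RET"),
    (0x90, "NOP"),
    (0xF4, "HLT"),
]


def generate_c_table(instructions, name="insts_table"):
    """Emit C source for a static array built from the instruction records."""
    lines = [
        "static const struct { unsigned char opcode; const char* mnemonic; } %s[] = {" % name
    ]
    # Sort by opcode so the generated table has a stable, searchable order.
    for opcode, mnemonic in sorted(instructions):
        lines.append('    { 0x%02X, "%s" },' % (opcode, mnemonic))
    lines.append("};")
    return "\n".join(lines)


print(generate_c_table(INSTRUCTIONS))
```

Keeping the records in Python and emitting the C source from them means the decoder code never has to be edited by hand when an instruction is added.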
Download: disOps.zip
History:
- 17th, Sep 2008 - Ver 1.7.30 - Fixed DDK project, added new DLL project under Win32 (so you can create distorm.dll and use it dynamically). In addition, added new Python support for diStorm which uses ctypes, thanks to Victor Stinner. [NEW]
- 7th, Mar 2008 - Ver 1.7.29 - LEA now ignores segment overrides.
- 13th, Feb 2008 - Ver 1.7.28 - Fixed NOP/XCHG in 64bits, uploaded disOps too.
- 8th, Dec 2007 - Ver 1.7.27 - Various instructions fixed.
- 31st, Aug 2007 - Ver 1.7.26 - SSE4a instruction set support, and various compilers (DMC, OW, TCC for Windows) support - Thanks to JvW, BTTR, LLee for helping.
- 20th, July 2007 - Ver 1.7.25 - A memory leak in the Python extension module is fixed.
- 3rd, July 2007 - Ver 1.7.24 - New instruction, multi-byte NOP.
- 19th, May 2007 - Ver 1.7.23 - Fixed some SVM instructions.
- 1st, May 2007 - Ver 1.7.22 - SSE4 instructions set was added! Thanks to Saul Tamari for noticing the MMX MOVD/Q instruction's mnemonics nuance.
- 17th, April 2007 - Ver 1.6.21 - Mov-Offset instructions in 64bits weren't decoded properly, thanks to Keith Kanios from NASM for reporting. diStorm now can be compiled under Mac (works for Tiger), thanks to Tim Ebringer for the new /build/mac directory.
- 3rd, March 2007 - Ver 1.6.20 - A bug fix for 3DNow! instructions that weren't decoded (bug produced in 1.6.19).
- 28th, Feb 2007 - Ver 1.6.19 - All string/IO instructions support both REP/REPNZ prefixes. A big change was made to the instructions data structures, sparing around 12kb in data.
- 14th, Feb 2007 - Ver 1.5.18 - A decoding bug in SSE conversion instructions is fixed.
- 15th, Jan 2007 - diStorm contains a new package to compile it using the DDK as a kernel driver for Windows.
- 5th, Jan 2007 - Ver 1.5.17 - Source code version only, what used to be called SSE4 is now officially called SSSE3.
- 2nd, Oct 2006 - Ver 1.5.16 - MOVSXD wasn't properly decoded and now MOVZXD is also supported. In addition, diStorm64 is now compiled to support Python 2.5!
- 28th, Aug 2006 - Secured VM (AMD) instructions were added!
- 10th, July 2006 - A better Linux sample project has been uploaded thanks to Mikhail Teterin. New diStorm in executable form is downloadable for Windows only.
- 1st, July 2006 - Ver 1.5.14 - Sometimes REX prefix was used and dropped at the same time, and String instructions are now promoted to 64bits only with REX prefix. Thanks to Alorent for the bug reports.
- 9th, June 2006 - Ver 1.5.13 - A bug with a specific byte sequence was fixed. Thanks to Peter Fedorow for reproducing this bug on mingw. In addition, many operand types were fixed and are now more accurate (SGDT, SIDT, LGDT, LIDT, JMP FAR, CALL FAR, SMSW, LMSW, LDS, LES, LFS, LGS, LSS, etc).
- 3rd, June 2006 - Ver 1.5.12 - An invalid memory access that occurred only in 64bit environments is now fixed.
- 2nd, June 2006 - Ver 1.5.11 - Fix of a new bug (caused in latest version) of 32bits immediates in 64bits mode.
- 31st, May 2006 - Ver 1.5.10 - The source code was refined and PUSH/POP instructions were corrected, thanks to Sanjay for helping!
- 20th, May 2006 - Ver 1.3.9 - diStorm now supports the VMX and SSE4 instructions sets!
- 12th, May 2006 - Ver 1.2.8 is a source code release. diStorm now supports big endian machines! Thanks to Sanjay who tested it on PowerPC.
- 18th, April 2006 - Ver 1.1.8 is released, LOOPxx instructions are now categorized as Natives. Thanks to 'CPUID' again.
- 20th, March 2006 - Ver 1.1.7 is released, a bug in the way mandatory prefixes were treated in 64 bits was fixed, thanks to Alorent.
- 18th, March 2006 - Ver 1.1.6 is released, MOVZX/MOVSX support both 16bit registers and RETF is now operand size sensitive (66, 48), thanks to 'CPUID'.
- 3rd, February 2006 - Ver 1.1.5 is released, slight fixes in PUSH, JMP and CMP instructions.
- 27th, January 2006 - Ver 1.1.4 is released, diStorm now supports 64bit offsets.
diStorm is now released under the BSD license!
- 17th, January 2006 - Ver 1.1.3 is released. A bug fix for some special instructions which ignored REX.B flag, thanks to Stefan.
- 11th, January 2006 - diStorm source code is released!
- 10th, January 2006 - Documentation is uploaded.
- 24th, December 2005 - diStorm64 1.1.2 is released (hopefully final version). The linked list interface is deprecated and a new interface is now used to quicken things. This version is highly optimized and will be open-sourced next week.
- 24th, December 2005 - The old release of diStorm is now obsolete and removed from the page.
- 18th, November 2005 - diStorm64 is now available for Linux, thanks to Izik for helping.
- 3rd, November 2005 - diStorm64 updated, the Call instruction wasn't promoted to 64 bits, thanks to Stefan for reporting.
- 10th, September 2005 - New 16/32 bits version - fixed bugs that were found in diStorm64 development.
- 6th, September 2005 - Release of diStorm64!
- 19th, July 2005 - Decoding bug for 3.3 (+REG bits) bytes long instructions was fixed.
- 9th, July 2005 - Fix of a memory leak in the C library, thanks to Qages.
- 6th, June 2005 - First release of diStorm.
Where's The Assembler?
No assembler for 80x86 is going to be written.
There are currently plenty of great assemblers out there, but there's a special one which already supports AMD64;
please visit the YASM project for more information.
Who's Behind It All?
Gil Dabah (AKA Arkon) started this project in June 2003.
He wrote this project from scratch because of the challenge in decoding all 80x86 instructions and for his own fun. Today, he still runs this project alone in his free time.
In the far future he plans on writing and releasing code analysis tools.
Contact:
If you wish to learn more about this project,
or you think you can help, or maybe you just found a bug,
or just feel like saying anything to me,
send an email, here.
If you want to leave any comments, please use The Forum of RageStorm.
——————————————————————————————————————————————
Copyright © RageStorm 2009, Gil Dabah