De Bruijn Sequences
"Calculating" offset
A de Bruijn sequence is a sequence of symbols over a given alphabet that contains every possible substring of a given length exactly once as a contiguous block. This makes finding the offset to EIP much simpler: we can pass in a de Bruijn sequence as input, read the value that ends up in EIP, and find its single possible match within the sequence to calculate the offset.
For example, consider an alphabet with 4 symbols: 0, 1, 2 and 3.
A de Bruijn sequence of order 2 (that is, for substrings of length 2) over this alphabet is a sequence of symbols that contains every possible two-symbol substring of the alphabet exactly once as a contiguous block.
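One possible de Bruijn sequence of order 2 (substring length 2) over the symbols 0 to 3 is: 0010203112132233.
Read cyclically, so that the final 3 wraps around to the leading 0, this sequence contains every possible two-symbol substring of the alphabet exactly once, including 00, 01, 10 and 30.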
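The "exactly once" property can be checked mechanically. The following is a minimal sketch in Python; the helper name `is_de_bruijn` is hypothetical and it assumes the sequence is read cyclically, with the end wrapping around to the start.

```python
def is_de_bruijn(sequence, alphabet, order):
    """Return True if `sequence`, read cyclically, contains every
    `order`-symbol string over `alphabet` exactly once."""
    expected = len(alphabet) ** order
    if len(sequence) != expected or not set(sequence) <= set(alphabet):
        return False
    # Append the first order-1 symbols so windows can wrap around the end.
    wrapped = sequence + sequence[:order - 1]
    windows = {wrapped[i:i + order] for i in range(len(sequence))}
    # There are len(sequence) windows; if all are distinct, each possible
    # string appears exactly once.
    return len(windows) == expected


print(is_de_bruijn("0010203112132233", "0123", 2))  # True
print(is_de_bruijn("0123012301230123", "0123", 2))  # False
```

The second call fails because a plain repetition of 0123 only ever produces the pairs 01, 12, 23 and 30.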
Generate sequences
A sequence of the required length can be generated from the command line.
GDB extensions such as GEF and PEDA also provide a command to generate patterns.
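When no such tool is at hand, a pattern can also be produced with a few lines of Python. This is a minimal sketch, not the implementation of any particular tool: the function names `de_bruijn` and `pattern` are hypothetical, and the defaults (lowercase alphabet, order 4) are an assumption chosen so that every 4-byte window is unique. It builds the sequence by concatenating Lyndon words whose length divides the order.

```python
from itertools import islice
import string


def de_bruijn(alphabet, order):
    """Lazily yield the symbols of a de Bruijn sequence over `alphabet`."""
    k = len(alphabet)
    word = [-1]
    while word:
        word[-1] += 1
        # `word` is now a Lyndon word; keep it if its length divides `order`.
        if order % len(word) == 0:
            for index in word:
                yield alphabet[index]
        # Extend periodically up to `order`, then drop trailing maximal symbols.
        period = len(word)
        while len(word) < order:
            word.append(word[-period])
        while word and word[-1] == k - 1:
            word.pop()


def pattern(length, alphabet=string.ascii_lowercase, order=4):
    """Return the first `length` symbols of the sequence as a string."""
    return "".join(islice(de_bruijn(alphabet, order), length))


print(pattern(28))  # aaaabaaacaaadaaaeaaafaaagaaa
```

With these defaults the sequence holds 26^4 symbols, so longer requests are simply truncated at that length.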
Usage
This type of pattern is mostly used to retrieve the offset between user input and EIP. The entire string is sent as user input, and the program then crashes because none of the addresses that can end up in EIP (0x61616166, for example) points to a valid instruction.
The offset can then be determined by locating the value found in EIP within the entire string.
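The lookup itself is a plain substring search. The sketch below assumes a 32-bit little-endian target, so the register value is converted back to the byte order in which it was sent. The helper name `find_offset` and the hard-coded `sent` bytes (the start of a lowercase order-4 pattern) are illustrative assumptions.

```python
import struct


def find_offset(pattern: bytes, crash_value: int) -> int:
    """Return the index in `pattern` of the 4 bytes that ended up in EIP,
    or -1 if they are not part of the pattern."""
    # x86 is little-endian: the lowest byte of the register came first
    # in the input, so 0x61616166 corresponds to the bytes b"faaa".
    needle = struct.pack("<I", crash_value)
    return pattern.find(needle)


# Assumed input: the first 32 bytes of a lowercase order-4 pattern.
sent = b"aaaabaaacaaadaaaeaaafaaagaaahaaa"
print(find_offset(sent, 0x61616166))  # 20
```

An offset of 20 means that 20 bytes of padding come before the 4 bytes that overwrite the saved instruction pointer.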
It's possible to look up a specific value in the entire string using the -l parameter.
The same lookup is also possible directly in GDB.
Python code can also use de Bruijn sequences directly to retrieve the offset between user input and the saved instruction pointer.