🏳️
The CTF Recipes
  • Introduction
  • Cryptography
    • Introduction
    • General knowledge
      • Encoding
        • Character encoding
          • ASCII
          • Unicode
          • UTF-8
        • Data encoding
          • Base16
          • Base32
          • Base64
      • Maths
        • Modular arithmetic
          • Greatest Common Divisor
          • Fermat's little theorem
          • Quadratic residues
          • Tonelli-Shanks
          • Chinese Remainder Theorem
          • Modular binomial
      • Padding
        • PKCS#7
    • Misc
      • XOR
    • Mono-alphabetic substitution
      • Index of coincidence
      • frequency analysis
      • Well known algorithms
        • 🔴Scytale
        • 🔴ROT
        • 🔴Polybe
        • 🔴Vigenere
        • 🔴Pigpen cipher
        • 🔴Affine cipher
    • Symmetric Cryptography
      • AES
        • Block Encryption procedure
          • Byte Substitution
          • Shift Row
          • Mix Column
          • Add Key
          • Key Expansion / Key Schedule
        • Mode of Operation
          • ECB
            • Block shuffling
              • Challenge example
            • ECB Oracle
              • Challenge example
          • CBC
            • Bit flipping
              • Challenge example
            • Padding oracle
              • Challenge example
          • OFB
            • Key stream reconstruction
            • Encrypt to Uncrypt
  • 🛠️Pwn
    • General knowledge
      • STACK
        • Variables storage
        • Stack frame
      • PLT and GOT
      • HEAP
        • HEAP operations
        • Chunk
        • Bins
        • Chunk allocation and reallocation
      • Syscall
    • Architectures
      • aarch32
        • Registers
        • Instruction set
        • Calling convention
      • aarch64
        • Registers
        • Instruction set
        • Calling convention
      • mips32
        • Registers
        • Instruction set
        • Calling convention
      • mips64
        • Registers
        • Instruction set
        • Calling convention
      • x86 / x64
        • Registers
        • Instruction set
        • Calling convention
    • Stack exploitation
      • Stack Buffer Overflow
        • Dangerous functions
          • gets
          • memcpy
          • sprintf
          • strcat
          • strcpy
        • Basics
          • Challenge example
        • Instruction pointer Overwrite
          • Challenge example
        • De Bruijn Sequences
        • Stack reading
          • Challenge example
      • Format string
        • Dangerous functions
          • printf
          • fprintf
        • Placeholder
        • Data Leak
          • Challenge example
        • Data modification
          • Challenge example
      • Arbitrary code execution
        • Shellcode
        • ret2reg
        • Code reuse attack
          • Ret2plt
          • Ret2dlresolve
          • GOT Overwrite
          • Ret2LibC
          • Leaking LibC
          • Ret2csu
          • Return Oriented Programming - ROP
          • Sigreturn Oriented Programming - SROP
          • Blind Return Oriented Programming - BROP
            • Challenge example
          • 🔴Call Oriented Programming - COP
          • 🔴Jump Oriented Programming - JOP
          • One gadget
        • Stack pivoting
    • 🛠️Heap exploitation
      • Heap overflow
        • Challenge example
      • Use after free
        • Challenge example
      • 🛠️Double free
      • 🔴Unlink exploit
    • Protections
      • Stack Canaries
      • No eXecute
      • PIE
      • ASLR
      • RELRO
    • Integer overflow
Powered by GitBook
On this page
  • How it works ?
  • Resources
  1. Cryptography
  2. General knowledge
  3. Encoding
  4. Character encoding

UTF-8

PreviousUnicodeNextData encoding

Last updated 2 years ago

UTF-8 (8-bit Unicode Transformation Format) is a variable-width character encoding standard that is widely used for representing the characters of the character set. It was designed as a replacement for and other single-byte character encodings, with the goal of supporting all characters used in the world's writing systems.

How it works ?

In UTF-8, each character is represented by one to four bytes, depending on the complexity of the character.

characters are still represented by a single byte, but other characters require more bytes to represent all of their unique details.

The advantage of UTF-8 is that it is backwards compatible with ASCII and can be used with existing ASCII-based systems, while also supporting a much wider range of characters.

Resources

Unicode
ASCII
ASCII
UTF-8 code pageCharset.org: Character Set Tools, Conversions and Programming
Logo