Deep Dives β Bit Manipulation: The Complete Picture
Difficulty: Intermediate
Estimated reading time: ~90 minutes | Exercises: ~2 hours
π Table of Contents
Foundations
- The Four Bitwise Logic Operators
- Shift Operators: SHL, SHR, SAR
- Byte-Level Operations
- Operator Precedence & Gotchas
Twoβs Complement
- How Negative Numbers Work
- Why Addition Just Works
- Sign Extension
- The SIGNEXTEND Opcode
- SAR Revisited
Masking Patterns
Bitmap Sets
- uint256 as 256 Booleans
- Core Operations
- Set Algebra
- Population Count
- Iterating Over Set Bits
- Multi-Word Bitmaps
LSB & MSB Isolation
- Isolate the Lowest Set Bit
- Clear the Lowest Set Bit
- Count Trailing Zeros
- Find the Highest Set Bit
- Count Leading Zeros
Power-of-2 Tricks
Common Bit Idioms
- Branchless Conditional Select
- Branchless Min / Max
- Branchless Absolute Value
- Swap Without Temporary
- Dirty Bits & Cleaning
- Checking Multiple Conditions at Once
- Build Exercise: BitToolkit
π‘ Foundations β Bitwise Operators
Bit manipulation is the art of working with individual bits inside a number. In the EVM, every value is a 256-bit word β thatβs 256 individual switches you can flip, test, and combine. Mastering these operations unlocks the most gas-efficient patterns in Solidity and is essential for reading production assembly.
This section covers the building blocks. Everything else in this deep dive is a combination of these primitives.
π‘ Concept: The Four Bitwise Logic Operators
Why this matters: Every masking operation, every packed struct, every bitmap flag β they all reduce to these four operations. If you understand these cold, the rest of bit manipulation is just recognizing patterns.
The four logic operators work bit by bit β each bit in the result depends only on the corresponding bits in the inputs.
AND (&)
Rule: The result bit is 1 only when both input bits are 1.
Truth table: Example (8-bit):
A B A&B
0 0 0 0b_1100_1010
0 1 0 & 0b_1111_0000
1 0 0 βββββββββββββ
1 1 1 0b_1100_0000
Mental model: AND is a filter. The mask says which bits to keep β 1 means βkeep this bit,β 0 means βforce it to zero.β This is why AND is the core of every extraction pattern.
// Keep only the lower 8 bits of a value
uint256 lower8 = value & 0xFF; // 0xFF = 0b_1111_1111
// Check if bit 5 is set
bool isSet = (value & (1 << 5)) != 0;
OR (|)
Rule: The result bit is 1 when at least one input bit is 1.
Truth table: Example (8-bit):
A B A|B
0 0 0 0b_1100_0000
0 1 1 | 0b_0000_1010
1 0 1 βββββββββββββ
1 1 1 0b_1100_1010
Mental model: OR is a combiner. It merges bits together without destroying existing ones. This is why OR is used to pack fields β you position each field, then OR them all together.
// Set bit 3 (without touching other bits)
value = value | (1 << 3);
// Combine two non-overlapping fields
uint256 packed = (high << 128) | low;
XOR (^)
Rule: The result bit is 1 when the inputs are different.
Truth table: Example (8-bit):
A B A^B
0 0 0 0b_1100_1010
0 1 1 ^ 0b_1111_0000
1 0 1 βββββββββββββ
1 1 0 0b_0011_1010
Mental model: XOR is a conditional flipper. The mask says which bits to toggle β 1 means βflip this bit,β 0 means βleave it alone.β
Key property β self-inverse: a ^ b ^ b == a. XOR-ing the same value twice cancels out. This is the basis for XOR swap, branchless select, and many cryptographic primitives.
// Toggle bit 5
value = value ^ (1 << 5);
// Toggle it again β back to original
value = value ^ (1 << 5);
NOT (~)
Rule: Flips every bit. 0 becomes 1, 1 becomes 0.
Example (8-bit): 256-bit:
~ 0b_1100_1010 ~ 0x00000000...0000FFFF
βββββββββββββ ββββββββββββββββββββββββ
0b_0011_0101 0xFFFFFFFF...FFFF0000
Mental model: NOT creates the inverse mask. If a mask selects certain bits, ~mask selects everything else. This is essential for the βclear a fieldβ pattern: value & ~mask zeros out exactly the bits the mask covers.
Relationship to twoβs complement: In the EVM, ~x == -x - 1 (or equivalently, -x == ~x + 1). This isnβt a coincidence β itβs how twoβs complement works. Weβll explore this in the next section.
// Clear the lower 16 bits
value = value & ~uint256(0xFFFF);
// Invert a selection mask
uint256 keepOthers = ~fieldMask;
π‘ Concept: Shift Operators β SHL, SHR, SAR
Why this matters: Shifts are how you position bits. Every packing operation uses a shift to move a value to its slot. Every extraction uses a shift to move a field back to position 0. And in the EVM, shifts are the cheapest way to multiply or divide by powers of 2.
SHL β Shift Left (<<)
Moves all bits toward the most significant end. Vacated positions on the right fill with zeros. Bits that shift past the top are lost.
Before: 0b_0000_1011 (decimal 11)
SHL 3: 0b_0101_1000 (decimal 88)
^^^ β these zeros filled in
^^^ β these bits moved left 3 positions
Equivalent: 11 Γ 2Β³ = 11 Γ 8 = 88
In the EVM (256-bit), shifting left by n is equivalent to multiplying by 2^n β but at 3 gas (SHL) instead of 5 gas (MUL).
// Position a value into the upper 128 bits of a uint256
uint256 positioned = value << 128;
// Multiply by 8
uint256 result = x << 3; // x * 2Β³
SHR β Shift Right (unsigned) (>>)
Moves all bits toward the least significant end. Vacated positions on the left fill with zeros. Bits that shift past the bottom are lost.
Before: 0b_1011_0100 (decimal 180)
SHR 3: 0b_0001_0110 (decimal 22)
^^^ β these zeros filled in
^^^ β these bits fell off (lost)
Equivalent: 180 / 2Β³ = 180 / 8 = 22 (integer division, remainder lost)
Critical: SHR always fills with zeros, regardless of the original sign. This is correct for unsigned values but wrong for signed values β see SAR below.
// Extract the upper 128 bits of a uint256
uint256 upper = packed >> 128;
// Divide by 4
uint256 result = x >> 2; // x / 2Β²
SAR β Shift Arithmetic Right
Moves bits right, but fills vacated positions with the sign bit (bit 255) instead of zero.
Positive number (sign bit = 0):
Before: 0b_0110_0000 (decimal 96, sign bit 0)
SAR 2: 0b_0001_1000 (decimal 24)
^^ β filled with 0 (sign bit)
Same result as SHR β no difference for positive numbers.
Negative number (sign bit = 1):
Before: 0xFF...FA (-6 in two's complement, sign bit 1)
SAR 1: 0xFF...FD (-3)
^ β filled with 1 (sign bit)
With SHR instead:
Before: 0xFF...FA (-6 in two's complement)
SHR 1: 0x7F...FD (huge positive number!)
^ β filled with 0 (WRONG for signed!)
The rule: Use SHR for uint values, SAR for int values. Using the wrong one silently produces garbage.
In Solidity, the compiler picks the right shift automatically based on the type:
uint256 u = 100;
int256 s = -100;
u >> 1; // Compiles to SHR β 50
s >> 1; // Compiles to SAR β -50
In assembly, you must choose explicitly:
let unsigned_result := shr(1, u) // Zero-fill
let signed_result := sar(1, s) // Sign-fill
All Three Shifts β Side by Side
Input: 0b_1110_0100 (8-bit example)
SHL 2: 0b_1001_0000 β bits move left, zeros fill right
SHR 2: 0b_0011_1001 β bits move right, zeros fill left
SAR 2: 0b_1111_1001 β bits move right, sign bit fills left
(sign bit was 1, so 1s fill in)
π‘ Concept: Byte-Level Operations
Why this matters: The EVM is a 256-bit machine, but many real-world values are byte-sized (addresses are 20 bytes, selectors are 4 bytes, flags are 1 byte). The BYTE opcode and byte-oriented shift patterns let you work at byte granularity within a 256-bit word.
The BYTE Opcode
byte(i, x) extracts the i-th byte from the 256-bit word x, returning it as a uint256 (right-aligned).
Critical: Byte indexing is big-endian β byte 0 is the most significant byte (leftmost), byte 31 is the least significant (rightmost).
256-bit word:
ββββββββ¬βββββββ¬βββββββ¬ββββββ¬βββββββ¬βββββββ
βbyte 0βbyte 1βbyte 2β ... βbyte30βbyte31β
ββββββββ΄βββββββ΄βββββββ΄ββββββ΄βββββββ΄βββββββ
MSB LSB
byte(0, 0xAB000000...00) β 0xAB (most significant byte)
byte(31, 0x00000000...CD) β 0xCD (least significant byte)
BYTE vs Shift+Mask
You can achieve the same result with shifts and AND:
// These are equivalent:
let b := byte(i, x)
let b := and(shr(mul(sub(31, i), 8), x), 0xFF)
The shift version calculates: βto get byte i, shift right by (31 - i) * 8 bits, then mask to 8 bits.β The subtraction from 31 accounts for big-endian ordering.
When to use which:
byte(i, x)β when you know the byte index at compile time, cleaner to read- Shift+mask β when you need to extract multi-byte fields (2, 4, or 20 bytes), BYTE can only do one byte at a time
// Extract bytes 0-3 (a 4-byte selector from the MSB end)
let selector := shr(224, x) // shift right 224 bits = 28 bytes
// Extract bytes 12-31 (a 20-byte address from the LSB end)
let addr := and(x, 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF)
Big-Endian Byte Order
The EVM stores values big-endian: the most significant byte is at the lowest address. This matters when youβre reading from memory or calldata:
Value: 0x0000...00000123 (uint256 = 291)
In memory (32 bytes):
Address: 0x00 0x01 0x02 ... 0x1D 0x1E 0x1F
Content: 0x00 0x00 0x00 ... 0x00 0x01 0x23
^^^^^^^^^^^^^^ ^^^^^^^^^^^
leading zeros actual value (right-aligned)
Value types like uint8, address, bool are right-aligned when stored as 256-bit words in the EVM β padded with zeros on the left.
The exception: bytesN types (not bytes memory, but fixed bytes1 through bytes32) are left-aligned β padded with zeros on the right:
bytes4(0xDEADBEEF) as a 256-bit word:
0xDEADBEEF00000000000000000000000000000000000000000000000000000000
^^^^^^^^^^ value ^^^^^^ padding
This is why extracting a function selector requires shr(224, calldataload(0)) β the 4-byte selector sits in the leftmost bytes, and you shift it right to position 0.
π‘ Concept: Operator Precedence & Gotchas
Solidity Precedence Trap
Solidityβs operator precedence can surprise you. Bitwise operators have lower precedence than comparison operators:
// β οΈ WRONG β this compiles but doesn't do what you think:
if (value & 0xFF == 0) { ... }
// Parses as: value & (0xFF == 0) β value & false β value & 0
// β
CORRECT β explicit parentheses:
if ((value & 0xFF) == 0) { ... }
The rule: Always parenthesize bitwise expressions when combined with comparisons. The compiler wonβt warn you β the wrong version is syntactically valid.
Full precedence (highest to lowest, for the operators we care about):
~(NOT), unary-**(exponentiation)*,/,%+,-<<,>>(shifts)<,>,<=,>=(comparisons)==,!=&(AND)^(XOR)|(OR)
Notice: shifts come before comparisons, but comparisons come before all bitwise operators. This means a << 2 | b is (a << 2) | b (correct), but a & b == c is a & (b == c) (almost certainly wrong).
Assembly Has No Precedence
In Yul/inline assembly, there is no operator precedence β everything is a function call:
// No ambiguity β explicit nesting
let result := and(shr(128, packed), 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF)
This is one reason assembly code, despite being harder to read at first, avoids an entire class of precedence bugs.
π‘ Twoβs Complement
Every signed integer in the EVM uses twoβs complement representation. If youβve ever wondered why type(int256).min is -2^255 but type(int256).max is only 2^255 - 1, or why -1 in twoβs complement (all ones) equals type(uint256).max, this section explains the underlying mechanics.
π‘ Concept: How Negative Numbers Work
Why this matters: Signed integers appear throughout Solidity β int256 for price deltas, int128 for balance changes, int24 for tick values. Every time you pack, unpack, shift, or compare a signed value, twoβs complement rules apply. Misunderstand them and your math silently breaks.
The problem: A bit can only be 0 or 1. How do you represent negative numbers?
The solution: Reserve the top bit as a sign indicator. If bit 255 is 0, the number is non-negative. If bit 255 is 1, the number is negative. But itβs not as simple as βa sign flagβ β the actual encoding uses twoβs complement.
The Encoding Rule
To negate a number: flip all bits, then add 1.
Letβs trace through with 8 bits (the same logic applies to 256 bits):
Step 1: Start with 5
0000_0101 (decimal 5)
Step 2: Flip all bits (NOT)
1111_1010 (this is "one's complement" of 5)
Step 3: Add 1
1111_1011 (this is -5 in two's complement)
Letβs verify: 5 + (-5) should equal 0:
0000_0101 ( 5)
+ 1111_1011 (-5)
βββββββββββ
1_0000_0000 (the 1 overflows out of 8 bits β result is 0) β
The Number Line (8-bit)
Binary: 0000_0000 ... 0111_1111 1000_0000 ... 1111_1111
Unsigned: 0 ... 127 128 ... 255
Signed: 0 ... +127 -128 ... -1
β β
max pos min neg (the "extra" negative)
Key observations:
- Positive range:
0to2^(n-1) - 1(0 to 127 for 8-bit) - Negative range:
-1to-2^(n-1)(-1 to -128 for 8-bit) - Thereβs one more negative than positive β this asymmetry is inherent to twoβs complement
1111_1111is -1 (not -127), and1000_0000is -128 (not -0)- -1 is all ones β
0xFFin 8-bit,0xFF...FFin 256-bit. This is whyuint256(int256(-1)) == type(uint256).max
Common Values to Memorize
| Value | 8-bit | 256-bit |
|---|---|---|
| 0 | 0x00 | 0x00...00 |
| 1 | 0x01 | 0x00...01 |
| -1 | 0xFF | 0xFF...FF |
| max positive | 0x7F (127) | 0x7F...FF (2Β²β΅β΅ - 1) |
| min negative | 0x80 (-128) | 0x80...00 (-2Β²β΅β΅) |
π‘ Concept: Why Addition Just Works
The beauty of twoβs complement: The CPU doesnβt need separate circuits for signed and unsigned addition. The same binary addition works for both β overflow just wraps around naturally.
Example: 3 + (-5) = -2
0000_0011 ( 3)
+ 1111_1011 (-5)
βββββββββββ
1111_1110 (-2 β)
Verify: flip + add 1:
~1111_1110 = 0000_0001
0000_0001 + 1 = 0000_0010 = 2
So the original was -2 β
This is why the EVM only has one ADD opcode, not separate signed/unsigned versions. At the bit level, add(a, b) is the same regardless of interpretation.
Where signed/unsigned diverge: Comparison (slt vs lt) and division (sdiv vs div). The bits are the same β the interpretation differs.
// 0xFF...FF as uint256 = 2Β²β΅βΆ - 1 (huge positive)
// 0xFF...FF as int256 = -1
lt(0xFFFFFFFF...FF, 1) // 0 (false: 2Β²β΅βΆ-1 is NOT less than 1)
slt(0xFFFFFFFF...FF, 1) // 1 (true: -1 IS less than 1)
π‘ Concept: Sign Extension
The problem: You have an int8 value (-5 = 0xFB) stored in a 256-bit word. The EVM sees 0x00000000...000000FB. But thatβs +251 as a uint256, not -5. How does the EVM know itβs negative?
The answer: When you cast a smaller signed type to a larger one, Solidity sign-extends β it copies the sign bit into all the new upper bits.
int8(-5) in 8 bits: 1111_1011
^
sign bit (1 = negative)
Sign-extend to 256 bits:
1111_1111_1111_1111_..._1111_1111_1111_1011
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
All new bits copy the sign bit (1)
Result: 0xFFFFFFFF...FFFFFFFB = -5 as int256 β
For a positive value, sign extension fills with zeros (because the sign bit is 0):
int8(5) in 8 bits: 0000_0101
^
sign bit (0 = positive)
Sign-extend to 256 bits:
0000_0000_0000_0000_..._0000_0000_0000_0101
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
All new bits copy the sign bit (0)
Result: 0x00000000...00000005 = 5 as int256 β
Why this preserves the value: Twoβs complement guarantees that sign extension doesnβt change the numberβs value. The proof is simple β adding leading 1s to a negative number is like adding 0xFF00 to 0xFB, which in twoβs complement is the same as adding -256 + 256 = 0.
In Solidity, sign extension happens automatically when you cast:
int8 small = -5;
int256 big = int256(small); // Compiler inserts sign extension
// big == -5 (0xFFFFFFFF...FFFFFFFB)
In assembly, you must handle it yourself β see SIGNEXTEND below.
π‘ Concept: The SIGNEXTEND Opcode
Why this matters: When you read a packed signed integer from storage or calldata in assembly, you get raw bits with no type information. If the value was an int8, you have 8 meaningful bits inside a 256-bit word. SIGNEXTEND tells the EVM: βtreat byte b as the sign byte and extend it.β
Syntax: signextend(b, x) β extends the sign of byte b (0-indexed from the least significant end) through all higher bits.
signextend(0, x) β treat x as int8 (1 byte), sign-extend to 256 bits
signextend(1, x) β treat x as int16 (2 bytes), sign-extend to 256 bits
signextend(2, x) β treat x as int24 (3 bytes), sign-extend to 256 bits
signextend(15, x) β treat x as int128 (16 bytes), sign-extend to 256 bits
Walkthrough:
x = 0x00000000...0000009C (156 in decimal)
As uint8: 156 (positive)
As int8: -100 (negative! because bit 7 of byte 0 is 1)
signextend(0, x):
Byte 0 of x = 0x9C = 1001_1100
^
sign bit is 1 β extend with 1s
Result: 0xFFFFFFFF...FFFFFF9C = -100 as int256 β
Another example with a positive value:
x = 0x00000000...0000005A (90 in decimal)
signextend(0, x):
Byte 0 of x = 0x5A = 0101_1010
^
sign bit is 0 β extend with 0s (no change)
Result: 0x00000000...0000005A = 90 as int256 β
Common pattern β unpacking a signed field from storage:
// Packed: [uint128 amount0 | int128 amount1]
// amount1 is in the lower 128 bits
let raw := and(packed, 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF) // Extract lower 128 bits
let amount1 := signextend(15, raw) // byte 15 = bit 127 = sign bit of int128
Without SIGNEXTEND, raw containing a negative int128 would be interpreted as a huge positive number.
π‘ Concept: SAR Revisited
Now that you understand twoβs complement and sign extension, SAR makes complete sense.
SAR is a signed division by 2^n β it must preserve the sign. It does this by filling the vacated upper bits with the sign bit, exactly like sign extension.
SAR on -6 (shift right by 1 = divide by 2):
-6 = 0xFFFFFFFF...FFFFFFFA
Binary: ...1111_1010
SAR 1: ...1111_1101 = -3 β
^
sign bit (1) fills in from the left
SHR on -6 (same bits, treated as unsigned):
0xFFFFFFFF...FFFFFFFA (huge positive number in unsigned)
SHR 1: 0x7FFFFFFFFFFF...FD (half of that huge number)
^
zero fills in from the left β positive result
Rounding behavior: SAR rounds toward negative infinity, not toward zero.
SAR 1 on -7:
-7 = ...1111_1001
SAR 1: ...1111_1100 = -4 (rounds down, not toward zero)
Integer division: -7 / 2 = -3 (rounds toward zero)
SAR: -7 >> 1 = -4 (rounds toward -β)
This difference matters when precision counts. If you need round-toward-zero behavior for signed division, use sdiv instead of SAR.
π‘ Masking Patterns
A mask is a bit pattern used to select, modify, or test specific bits within a word. If bitwise operators are the tools, masks are the stencils. Every packing/unpacking operation, every flag check, every field update uses a mask.
This section teaches masking as a system of patterns β learn these four operations and you can read or write any packed data structure.
π‘ Concept: What Is a Mask?
A mask is a pattern of 1s and 0s that selects which bits to operate on.
The simplest mask is a contiguous run of 1s:
0x000000000000000000000000000000000000000000000000FFFFFFFFFFFFFFFF
^^^^^^^^^^^^^^^^
64 bits of 1s = lower 64 bits selected
Building Masks
A mask of width ones (starting from bit 0):
uint256 mask = (1 << width) - 1;
Why this works:
1 << 8 = 0b_1_0000_0000 (256)
256 - 1 = 0b_0_1111_1111 (255 = eight 1s)
A positioned mask (at a given offset):
uint256 positioned = mask << offset;
mask = 0xFF (8 bits)
offset = 16
positioned:
0b_0000_0000_1111_1111_0000_0000_0000_0000
^^^^^^^^
8 ones at bit position 16-23
Visual: A 256-Bit Word With Fields
Bit position: 255 ... 192 191 ... 128 127 ... 64 63 ... 0
ββββββββββββ¬βββββββββββ¬βββββββββββ¬βββββββββββ
β Field D β Field C β Field B β Field A β
β 64 bits β 64 bits β 64 bits β 64 bits β
ββββββββββββ΄βββββββββββ΄βββββββββββ΄βββββββββββ
Mask for Field A: (1 << 64) - 1 = 0x00...00FFFFFFFFFFFFFFFF
Mask for Field B: ((1 << 64) - 1) << 64 = 0x00...FFFFFFFFFFFFFFFF00...00
Mask for Field C: ((1 << 64) - 1) << 128
Mask for Field D: ((1 << 64) - 1) << 192
π‘ Concept: Extract a Field
Goal: Read a fieldβs value from a packed word.
Pattern: Shift right to position 0, then AND with mask to discard everything above.
extract(word, offset, width):
return (word >> offset) & ((1 << width) - 1)
Step-by-step example: Extract Field B (bits 64-127) from a packed word:
Packed word:
0x_DDDDDDDDDDDDDDDD_CCCCCCCCCCCCCCCC_BBBBBBBBBBBBBBBB_AAAAAAAAAAAAAAAA
Step 1: Shift right by 64
0x_0000000000000000_DDDDDDDDDDDDDDDD_CCCCCCCCCCCCCCCC_BBBBBBBBBBBBBBBB
^^^^^^^^^^^^^^^^
Field B is now at position 0
Step 2: AND with mask (64 bits of 1s)
0x_0000000000000000_0000000000000000_0000000000000000_BBBBBBBBBBBBBBBB
^^^^^^^^^^^^^^^^
Only Field B remains
In Solidity:
uint256 fieldB = (packed >> 64) & type(uint64).max;
In assembly:
let fieldB := and(shr(64, packed), 0xFFFFFFFFFFFFFFFF)
Alternative Form: Mask First, Then Shift
extract(word, offset, width):
return (word & (mask << offset)) >> offset
Both forms produce identical results. The first (shift-then-mask) is slightly more common because the mask is always at position 0, making it simpler.
π‘ Concept: Set a Field (Write)
Goal: Replace a fieldβs value in a packed word without disturbing other fields.
Pattern: Clear the field, then OR in the new value.
set(word, offset, width, newValue):
mask = ((1 << width) - 1) << offset
return (word & ~mask) | ((newValue << offset) & mask)
This is a read-modify-write operation β three steps:
Step-by-step: Set Field B (bits 64-127) to 0x1234567812345678:
Packed word:
0x_DDDDDDDDDDDDDDDD_CCCCCCCCCCCCCCCC_BBBBBBBBBBBBBBBB_AAAAAAAAAAAAAAAA
Step 1: Create positioned mask
mask << 64 = 0x_0000000000000000_0000000000000000_FFFFFFFFFFFFFFFF_0000000000000000
Step 2: Clear the field β AND with inverted mask
~mask = 0x_FFFFFFFFFFFFFFFF_FFFFFFFFFFFFFFFF_0000000000000000_FFFFFFFFFFFFFFFF
word & ~mask:
0x_DDDDDDDDDDDDDDDD_CCCCCCCCCCCCCCCC_0000000000000000_AAAAAAAAAAAAAAAA
^^^^^^^^^^^^^^^^
Field B is now zero
Step 3: Position new value and OR
newValue << 64 = 0x_0000000000000000_0000000000000000_1234567812345678_0000000000000000
cleared | positioned:
0x_DDDDDDDDDDDDDDDD_CCCCCCCCCCCCCCCC_1234567812345678_AAAAAAAAAAAAAAAA
^^^^^^^^^^^^^^^^
New value in place, others untouched β
In assembly:
// Clear field B (bits 64-127), then write new value
let mask := shl(64, 0xFFFFFFFFFFFFFFFF)
let cleared := and(packed, not(mask))
let result := or(cleared, shl(64, newValue))
π‘ Concept: Clear a Field
Goal: Zero out a specific field without touching others.
Pattern: AND with the inverted positioned mask.
clear(word, offset, width):
mask = ((1 << width) - 1) << offset
return word & ~mask
This is just step 2 of the βset a fieldβ pattern β without the OR step.
// Clear bits 16-31
uint256 cleared = packed & ~(uint256(0xFFFF) << 16);
let cleared := and(packed, not(shl(16, 0xFFFF)))
π‘ Concept: Toggle a Bit
Goal: Flip a single bit β if itβs 0, make it 1; if itβs 1, make it 0.
Pattern: XOR with a 1 at the target position.
toggle(word, position):
return word ^ (1 << position)
XORβs self-inverse property means toggling twice returns to the original:
word ^ (1 << 5) ^ (1 << 5) == word // always true
Before: ...0110_1010
Toggle bit 4:
XOR ...0001_0000
Result: ...0111_1010
^
flipped from 0 to 1
Toggle bit 4 again:
XOR ...0001_0000
Result: ...0110_1010 β back to original
π‘ Concept: Common Masking Mistakes
Mistake 1: OR Without Clearing First
// β οΈ WRONG β bits accumulate instead of replace:
packed = packed | (newValue << offset);
// If the field already had non-zero bits, they persist!
// Old: ...1111_0000... Field was 0xF0
// OR: ...1010_0000... New value 0xA0
// Got: ...1111_0000... Still 0xF0! (1 OR 1 = 1)
Fix: Always clear before OR-ing:
packed = (packed & ~(mask << offset)) | (newValue << offset);
Mistake 2: Off-by-One in Shift Amounts
// Field is at bits 16-31 (16 bits wide, starting at bit 16)
uint256 value = (packed >> 15) & 0xFFFF; // β οΈ WRONG β shifted by 15, not 16
uint256 value = (packed >> 16) & 0xFFFF; // β
Correct
The shift amount is the starting bit position, not βthe bit before the field.β
Mistake 3: Mask Too Wide
// Trying to extract a 16-bit field but using 32-bit mask
uint256 value = (packed >> 16) & 0xFFFFFFFF; // β οΈ Reads 32 bits β includes adjacent field!
uint256 value = (packed >> 16) & 0xFFFF; // β
Reads exactly 16 bits
Mistake 4: Dirty Upper Bits When Packing
// newValue might have bits above the field width
uint256 newValue = 0x1FFFF; // 17 bits! But field is only 16 bits wide
packed = (packed & ~(0xFFFF << 16)) | (newValue << 16);
// β οΈ The extra bit (bit 32) corrupts the adjacent field!
// Fix: mask the value before packing
packed = (packed & ~(0xFFFF << 16)) | ((newValue & 0xFFFF) << 16);
π‘ Bitmap Sets
A bitmap is a data structure where each bit represents the presence or absence of an element. A single uint256 can track 256 boolean flags with O(1) access β set, check, clear, and even set-algebraic operations like union and intersection, all in a single operation.
π‘ Concept: uint256 as 256 Booleans
Why this matters: Storing 256 separate bool variables costs 256 storage slots (256 Γ 2,100 = 537,600 gas cold). A bitmap stores them all in one slot (2,100 gas cold). Thatβs a 256x gas reduction.
uint256 bitmap:
βββββ¬ββββ¬ββββ¬ββββ¬ββββ¬ββββββ¬ββββ¬ββββ¬ββββ¬ββββ
β255β254β253β...β 5 β 4 β 3 β 2 β 1 β 0 β β bit position = element index
β 0 β 0 β 1 β...β 1 β 0 β 1 β 0 β 1 β 1 β β 1 = present, 0 = absent
βββββ΄ββββ΄ββββ΄ββββ΄ββββ΄ββββββ΄ββββ΄ββββ΄ββββ΄ββββ
This bitmap contains elements: {0, 1, 3, 5, 253, ...}
π‘ Concept: Core Operations
All bitmap operations are single instructions:
Check Membership
βIs element i in the set?β
bool member = ((bitmap >> i) & 1) != 0;
let member := and(shr(i, bitmap), 1)
Shift the target bit to position 0, then AND with 1 to isolate it.
Add to Set
βAdd element i to the set.β
bitmap = bitmap | (1 << i);
bitmap := or(bitmap, shl(i, 1))
OR with a 1 at position i. If the bit is already set, OR is idempotent β no harm done.
Remove from Set
βRemove element i from the set.β
bitmap = bitmap & ~(1 << i);
bitmap := and(bitmap, not(shl(i, 1)))
AND with 0 at position i (the inverted mask). All other bits preserved.
Toggle Membership
βFlip element i β add if absent, remove if present.β
bitmap = bitmap ^ (1 << i);
Check if Empty
bool empty = bitmap == 0;
let empty := iszero(bitmap)
π‘ Concept: Set Algebra
The power of bitmaps: set-theoretic operations are single instructions.
Set A: 0b_1100_1010
Set B: 0b_1010_0110
Union (A βͺ B) β elements in A OR B:
A | B = 0b_1110_1110
Intersection (A β© B) β elements in BOTH:
A & B = 0b_1000_0010
Difference (A \ B) β elements in A but NOT B:
A & ~B = 0b_0100_1000
Symmetric Difference (A β³ B) β elements in one but NOT both:
A ^ B = 0b_0110_1100
Each of these is a single EVM opcode (3 gas). Compared to looping through arrays and checking membership, this is orders of magnitude more efficient.
// Check if setA is a subset of setB (every element of A is in B)
bool isSubset = (setA & setB) == setA;
// Check if sets overlap (share any elements)
bool overlaps = (setA & setB) != 0;
// Merge two sets
uint256 merged = setA | setB;
π‘ Concept: Population Count (Counting Set Bits)
βHow many elements are in this bitmap set?β
This is the Hamming weight or popcount problem. Thereβs no EVM opcode for it, but thereβs an elegant O(log n) algorithm.
The Naive Approach
function popcount(uint256 x) pure returns (uint256 count) {
while (x != 0) {
count += x & 1;
x >>= 1;
}
}
This loops up to 256 times. For a small number of set bits, thereβs a faster loop using βclear lowest set bitβ:
function popcount(uint256 x) pure returns (uint256 count) {
while (x != 0) {
x = x & (x - 1); // Clear lowest set bit
count++;
}
}
This loops once per set bit β much faster when only a few bits are set.
The Parallel Bit-Counting Trick
For a constant-time solution, count bits in parallel using divide-and-conquer:
Concept (shown with 8 bits for clarity):
Input: 1 0 1 1 0 1 1 0
Step 1: Count pairs of bits (add adjacent bits)
01 10 01 01
(1) (2) (1) (1)
Step 2: Count groups of 4 (add adjacent pairs)
0011 0010
(3) (2)
Step 3: Count groups of 8 (add adjacent quads)
00000101
(5) β 5 bits were set β
Each step doubles the group size, using masks to isolate alternating groups:
// Step 1: Count bits in pairs
// Mask 0x55...55 = ...0101_0101 (selects even-position bits)
x = (x & 0x5555...5555) + ((x >> 1) & 0x5555...5555);
// Step 2: Count bits in nibbles (groups of 4)
// Mask 0x33...33 = ...0011_0011 (selects lower 2 of each 4)
x = (x & 0x3333...3333) + ((x >> 2) & 0x3333...3333);
// Step 3: Count bits in bytes (groups of 8)
x = (x & 0x0F0F...0F0F) + ((x >> 4) & 0x0F0F...0F0F);
// Continue doubling: 16, 32, 64, 128, 256
// ... (or use a multiply trick to sum all bytes at once)
The total operation count is constant regardless of input β no loops, no branches.
π‘ Concept: Iterating Over Set Bits
βProcess each element in the bitmap set.β
Pattern: repeatedly isolate and clear the lowest set bit.
function forEachBit(uint256 bitmap) pure {
while (bitmap != 0) {
// Isolate the lowest set bit
uint256 lsb = bitmap & (~bitmap + 1); // same as bitmap & (-bitmap)
// Find its position (index)
// lsb is a power of 2 β its log2 gives the bit position
uint256 index = log2(lsb);
// Process element 'index'
// ...
// Clear the lowest set bit and continue
bitmap = bitmap & (bitmap - 1);
}
}
This visits each set bit exactly once, in order from lowest to highest.
π‘ Concept: Multi-Word Bitmaps
When 256 bits arenβt enough, extend to multiple words:
mapping(uint256 => uint256) private bitmap;
function isSet(uint256 index) internal view returns (bool) {
uint256 wordIndex = index >> 8; // index / 256
uint256 bitIndex = index & 0xFF; // index % 256
return ((bitmap[wordIndex] >> bitIndex) & 1) != 0;
}
function set(uint256 index) internal {
uint256 wordIndex = index >> 8;
uint256 bitIndex = index & 0xFF;
bitmap[wordIndex] |= (1 << bitIndex);
}
Note the use of bit tricks for the index decomposition:
index >> 8isindex / 256(SHR is cheaper than DIV)index & 0xFFisindex % 256(AND is cheaper than MOD)
Both work because 256 is a power of 2. This pattern is covered in detail in the Power-of-2 Tricks section.
π‘ LSB & MSB Isolation
Isolating the lowest or highest set bit in a word is one of the most useful bit manipulation primitives. Itβs the core of bitmap iteration, binary search on bit positions, and several algorithmic tricks.
π‘ Concept: Isolate the Lowest Set Bit
The trick: x & (-x)
Result: A value with only the lowest set bit of x remaining. All other bits are zero.
x = 0b_1010_1100
-x = 0b_0101_0100 (flip + add 1)
x & -x = 0b_0000_0100 β only the lowest set bit
Why it works β step by step:
x = ...1010_1100
^^
lowest set bit is here (bit 2)
Step 1: ~x (flip all bits)
~x = ...0101_0011
Step 2: ~x + 1 = -x (add 1)
-x = ...0101_0100
The +1 carry propagates through the trailing 1s of ~x,
flipping them to 0, until it reaches the first 0 (which was
the lowest set bit of x), flipping it to 1.
Step 3: x & -x
x = ...1010_1100
-x = ...0101_0100
AND= ...0000_0100 β only bit 2 survives
The key insight: -x has the same lowest set bit as x, but all higher bits are inverted. AND-ing them cancels everything except that one bit.
let lsb := and(x, sub(0, x)) // x & (-x)
The result is always a power of 2 (or zero if x was zero).
π‘ Concept: Clear the Lowest Set Bit
The trick: x & (x - 1)
Result: x with its lowest set bit turned off.
x = 0b_1010_1100
x - 1 = 0b_1010_1011 (subtracting 1 flips the lowest set bit and all zeros below it)
x & x-1= 0b_1010_1000 β lowest set bit cleared
Why it works:
x = ...1010_1100
^
lowest set bit
x - 1 = ...1010_1011
^
Subtraction borrows from the lowest set bit:
- That bit flips 0β1... wait, let's trace carefully:
- Starting from the right, bit 0 is 0, borrow propagates
- Bit 1 is 0, borrow propagates
- Bit 2 is 1, borrow stops: this bit becomes 0, all bits below become 1
x-1 = ...1010_1011
x & (x-1):
...1010_1100
& ...1010_1011
= ...1010_1000
The lowest set bit and everything below it are now all different,
so AND zeroes them. Everything above is identical, so AND preserves them.
This is the foundation of several tricks:
- Iteration: Clear the lowest bit after processing it
isPowerOfTwo: A power of 2 has exactly one bit, sox & (x-1) == 0- Counting set bits: Loop: clear lowest bit, increment counter, repeat until zero
π‘ Concept: Count Trailing Zeros (CTZ)
βHow many zero bits are below the lowest set bit?β
This tells you the position of the lowest set bit. Equivalent to log2(x & -x).
Approach 1: Loop
function ctz(uint256 x) pure returns (uint256 count) {
if (x == 0) return 256; // No set bits
uint256 lsb = x & (~x + 1); // Isolate LSB
// lsb is a power of 2 β count how many times to shift
while (lsb > 1) {
lsb >>= 1;
count++;
}
}
Worst case: 255 iterations.
Approach 2: Binary Search
The same approach used for finding MSB (discussed later), but applied to the isolated LSB:
function ctz(uint256 x) pure returns (uint256 r) {
if (x == 0) return 256;
x = x & (~x + 1); // Isolate LSB (now a power of 2)
// Binary search: is the set bit in the upper half?
if (x >= (1 << 128)) { r += 128; x >>= 128; }
if (x >= (1 << 64)) { r += 64; x >>= 64; }
if (x >= (1 << 32)) { r += 32; x >>= 32; }
if (x >= (1 << 16)) { r += 16; x >>= 16; }
if (x >= (1 << 8)) { r += 8; x >>= 8; }
if (x >= (1 << 4)) { r += 4; x >>= 4; }
if (x >= (1 << 2)) { r += 2; x >>= 2; }
if (x >= (1 << 1)) { r += 1; }
}
8 comparisons regardless of input. Constant-time.
Approach 3: De Bruijn Sequence (Constant-Time Lookup)
A de Bruijn sequence is a special number where every n-bit substring (when shifted) is unique. By multiplying the isolated LSB by this magic constant and shifting, you get a unique index that maps to a lookup table.
// For 256-bit values, the de Bruijn constant and table are large.
// Simplified concept with 8 bits:
// De Bruijn constant for 8 bits: 0x17 (0b_00010111)
// Every 3-bit window in its binary expansion is unique:
// 000, 001, 010, 101, 011, 111, 110, 100
// Multiply isolated LSB by constant, shift to get table index
uint8 index = uint8((lsb * DEBRUIJN) >> 5); // Top 3 bits = index
uint8 position = LOOKUP_TABLE[index];
This is O(1) with no branches β a multiply, a shift, and a table lookup. Production implementations use this for 256-bit words with a 256-entry lookup table.
π‘ Concept: Find the Highest Set Bit (MSB / Bit Length)
βWhat is the position of the most significant set bit?β
Equivalently: βHow many bits does this number need?β (its bit-length minus 1).
Approach: Binary search by halving.
The idea: Is the highest set bit in the upper 128 bits or the lower 128? If upper, record 128 and shift right to focus on that half. Repeat with 64, 32, 16, 8, 4, 2, 1.
Input: x = 0x00000000_00000000_00000000_00000005_00000000_00000000_00000000_00000000
Is x >= 2^128? Yes β r = 128, x >>= 128
Now x = 0x00000005
Is x >= 2^64? No
Is x >= 2^32? No
Is x >= 2^16? No
Is x >= 2^8? No
Is x >= 2^4? No
Is x >= 2^2? Yes β r += 2 = 130, x >>= 2
Now x = 0x01
Is x >= 2^1? No
Result: MSB position = 130
In assembly (branchless version):
function findMSB(x) -> r {
// Each step: if x is large enough, add to r and shift down
let bit := shl(7, lt(0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF, x))
r := or(r, bit)
x := shr(bit, x)
bit := shl(6, lt(0xFFFFFFFFFFFFFFFF, x))
r := or(r, bit)
x := shr(bit, x)
bit := shl(5, lt(0xFFFFFFFF, x))
r := or(r, bit)
x := shr(bit, x)
bit := shl(4, lt(0xFFFF, x))
r := or(r, bit)
x := shr(bit, x)
bit := shl(3, lt(0xFF, x))
r := or(r, bit)
x := shr(bit, x)
bit := shl(2, lt(0xF, x))
r := or(r, bit)
x := shr(bit, x)
bit := shl(1, lt(0x3, x))
r := or(r, bit)
x := shr(bit, x)
r := or(r, lt(0x1, x))
}
How the branchless trick works: lt(threshold, x) returns 0 or 1. shl(7, ...) turns that into 0 or 128. or(r, ...) conditionally adds 128 to the result β no JUMPI needed.
This is the same pattern used in Soladyβs log2 implementation.
π‘ Concept: Count Leading Zeros (CLZ)
βHow many zero bits are above the highest set bit?β
CLZ = 255 - MSB_position
For x = 0, CLZ is 256 (all bits are zero).
This is simply the inverse of finding the MSB:
function clz(uint256 x) pure returns (uint256) {
if (x == 0) return 256;
return 255 - findMSB(x);
}
CLZ is useful for:
- Determining the bit-width of a value (256 - CLZ = bit width)
- Normalizing values for fixed-point operations
- Efficient comparisons (βdoes this number fit in N bits?β)
π‘ Power-of-2 Tricks
Powers of 2 have a special property: they have exactly one bit set. This makes them uniquely suited to bit manipulation tricks β faster than general-purpose arithmetic for alignment, modulo, and range checks.
π‘ Concept: Is Power of Two
The trick: x != 0 && (x & (x - 1)) == 0
Why it works: A power of 2 has exactly one bit set. x - 1 flips that bit and sets all bits below it. AND-ing gives zero β the only bit they shared was the one that got flipped.
x = 8: 0b_0000_1000 (one bit set β power of 2)
x - 1 = 7: 0b_0000_0111
AND: 0b_0000_0000 β zero! β
x = 6: 0b_0000_0110 (two bits set β not power of 2)
x - 1 = 5: 0b_0000_0101
AND: 0b_0000_0100 β non-zero! β
The x != 0 check is needed because 0 & (0 - 1) wraps to 0 & 0xFF...FF = 0, which would incorrectly report zero as a power of 2.
function isPowerOfTwo(uint256 x) pure returns (bool) {
return x != 0 && (x & (x - 1)) == 0;
}
π‘ Concept: Next Power of Two (Round Up)
Goal: Find the smallest power of 2 that is β₯ x.
The approach: Fill all bits below the highest set bit with 1s, then add 1.
x = 0b_0010_1001 (decimal 41)
Step 1: Smear bits downward
x |= x >> 1 β 0b_0011_1101
x |= x >> 2 β 0b_0011_1111
x |= x >> 4 β 0b_0011_1111 (already done for 8-bit)
After smearing: 0b_0011_1111 (all bits below MSB are set)
Step 2: Add 1
0b_0011_1111 + 1 = 0b_0100_0000 (decimal 64)
64 is the next power of 2 β₯ 41 β
Why the smearing works: Each x |= x >> N copies the highest set bit down by N positions. After all shifts, every bit from the MSB down to bit 0 is set. Adding 1 carries through all those 1s, producing a single bit one position above the original MSB.
For 256-bit values, you need shifts of 1, 2, 4, 8, 16, 32, 64, and 128:
function nextPowerOfTwo(uint256 x) pure returns (uint256) {
if (x == 0) return 1;
x -= 1; // Handle exact powers of 2 (e.g., 64 β 64, not 128)
x |= x >> 1;
x |= x >> 2;
x |= x >> 4;
x |= x >> 8;
x |= x >> 16;
x |= x >> 32;
x |= x >> 64;
x |= x >> 128;
return x + 1;
}
The x -= 1 at the start ensures that exact powers of 2 round to themselves (not the next one).
π‘ Concept: Rounding to Alignment Boundaries
Goal: Round a value up or down to the nearest multiple of a power-of-2 alignment.
Round Down (Floor)
roundDown(x, alignment):
return x & ~(alignment - 1)
This clears the lower bits β equivalent to rounding down to the nearest multiple.
x = 37, alignment = 16
alignment - 1 = 15 = 0b_0000_1111
~15 = 0b_1111_0000
37 & ~15:
0b_0010_0101 (37)
& 0b_1111_0000
= 0b_0010_0000 (32) β nearest multiple of 16 β€ 37 β
Round Up (Ceiling)
roundUp(x, alignment):
return (x + alignment - 1) & ~(alignment - 1)
Add alignment - 1 first to βpushβ past the boundary if not already aligned, then round down.
x = 37, alignment = 16
37 + 15 = 52
52 & ~15:
0b_0011_0100 (52)
& 0b_1111_0000
= 0b_0011_0000 (48) β nearest multiple of 16 β₯ 37 β
EVM memory uses this pattern for 32-byte word alignment:
// Round size up to nearest 32-byte boundary
let aligned := and(add(size, 31), not(31))
π‘ Concept: Modulo by Power of 2
The trick: x & (modulus - 1) replaces x % modulus when modulus is a power of 2.
Why it works: The remainder of dividing by 2^n is exactly the lower n bits.
x = 53, modulus = 16 (2β΄)
53 % 16 = 5
53 in binary: 0b_0011_0101
16 - 1 = 15: 0b_0000_1111
53 & 15:
0b_0011_0101
& 0b_0000_1111
= 0b_0000_0101 (5) β
The lower 4 bits ARE the remainder when dividing by 16. The upper bits represent how many complete groups of 16 fit.
// These are equivalent (when modulus is a power of 2):
uint256 r = x % 256; // MOD opcode: 5 gas
uint256 r = x & 255; // AND opcode: 3 gas
uint256 r = x & 0xFF; // Same thing, hex notation
This pattern appears in multi-word bitmap indexing:
uint256 wordIndex = index >> 8; // index / 256
uint256 bitIndex = index & 0xFF; // index % 256
π‘ Common Bit Idioms
These are recurring patterns built from the primitives above. They appear in production code, gas-optimized libraries, and assembly routines. Each one replaces a branching operation with pure arithmetic.
π‘ Concept: Branchless Conditional Select
Goal: Select value a or b based on a condition, without branching (no if, no JUMPI).
Pattern:
select(condition, a, b):
mask = 0 - condition // condition is 0 or 1
return b ^ (mask & (a ^ b)) // if mask=0: b; if mask=all-1s: a
How it works:
Case 1: condition = 1 (select a)
mask = 0 - 1 = 0xFFFF...FFFF (all 1s)
a ^ b = difference bits
mask & (a ^ b) = a ^ b (all bits pass through)
b ^ (a ^ b) = a β
Case 2: condition = 0 (select b)
mask = 0 - 0 = 0x0000...0000 (all 0s)
mask & (a ^ b) = 0 (all bits blocked)
b ^ 0 = b β
Why branchless matters: In the EVM, JUMPI costs 10 gas and introduces a conditional code path. Branchless select uses AND + XOR (6 gas total) and executes the same instructions regardless of the condition β constant gas, no jump.
// Select between two values based on a boolean (0 or 1)
let mask := sub(0, condition)
let result := xor(b, and(mask, xor(a, b)))
π‘ Concept: Branchless Min / Max
Built on conditional select:
// min(a, b)
let isLess := lt(a, b) // 1 if a < b, 0 otherwise
let mask := sub(0, isLess) // all-1s if a < b, all-0s otherwise
let minimum := xor(b, and(mask, xor(a, b))) // select a if a < b, else b
// max(a, b) β flip the condition
let isGreater := lt(b, a)
let mask2 := sub(0, isGreater)
let maximum := xor(b, and(mask2, xor(a, b)))
Or more directly:
// min(a, b) β equivalent but easier to read
let minimum := xor(a, and(xor(a, b), sub(0, lt(b, a))))
Trace with a = 5, b = 3:
lt(b, a) = lt(3, 5) = 1
sub(0, 1) = 0xFFFF...FFFF (all 1s)
xor(a, b) = xor(5, 3) = 6
and(0xFFFF...FFFF, 6) = 6
xor(a, 6) = xor(5, 6) = 3 = min(5, 3) β
π‘ Concept: Branchless Absolute Value
The trick: Use SAR to extract the sign, then conditionally negate.
let mask := sar(255, x) // All-zeros if x β₯ 0, all-ones if x < 0
let abs_x := xor(add(x, mask), mask)
How it works:
Case 1: x = 5 (positive)
sar(255, 5) = 0x00...00 (all zeros β sign bit was 0)
add(5, 0) = 5
xor(5, 0) = 5 β
Case 2: x = -5 (negative)
sar(255, -5) = 0xFF...FF (all ones β sign bit was 1)
add(-5, 0xFF...FF) = add(-5, -1) = -6
xor(-6, 0xFF...FF) = ~(-6) = 5 β
Why xor(x, all_ones) = ~x: XOR with 1 flips each bit β same as NOT.
Why ~(-6) = 5: In twoβs complement, ~x = -x - 1, so ~(-6) = 6 - 1 = 5.
The combined effect of add(x, -1) then NOT is: ~(x - 1) = -x. This is the twoβs complement negation rule in disguise.
π‘ Concept: Swap Without Temporary
The XOR swap:
a := xor(a, b) // a now holds a^b
b := xor(b, a) // b = b ^ (a^b) = a (original a)
a := xor(a, b) // a = (a^b) ^ a = b (original b)
Trace with a = 5, b = 3:
a = xor(5, 3) = 6
b = xor(3, 6) = 5 (original a β)
a = xor(6, 5) = 3 (original b β)
Why it works: XOR is self-inverse. a ^ b ^ b = a. Each step undoes one layer.
β οΈ Warning: Aliasing
If a and b refer to the same memory location (or same variable), XOR swap fails:
// If a and b alias the same location:
a = xor(a, a) = 0 // Oops β zeroed out!
b = xor(0, 0) = 0
a = xor(0, 0) = 0 // Both are zero. Data lost.
In practice, this aliasing issue rarely occurs in EVM assembly (stack variables donβt alias), but be aware of it for memory/storage operations.
π‘ Concept: Dirty Bits & Cleaning
The problem: The EVM operates on 256-bit words, but most types are smaller. An address is 160 bits, a uint96 is 96 bits, a bool is 1 bit. The remaining bits in the word can contain leftover data from previous operations β these are dirty bits.
Why it matters:
// Suppose 'addr' has garbage in the upper 96 bits:
// addr = 0xDEADBEEF_AAAAAAAA_BBBBBBBB_11111111_22222222_33333333_44444444_55555555
// ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// dirty bits (should be zero for a 160-bit address)
// Comparing with another clean address will FAIL even if the lower 160 bits match:
let same := eq(addr, clean_addr) // Returns 0 (false) because upper bits differ!
Cleaning patterns:
// Method 1: AND with mask
let clean := and(addr, 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF) // 160-bit mask
// Method 2: Shift left then right (clears upper bits)
let clean := shr(96, shl(96, addr)) // shift left 96 (push dirty bits off), shift right 96
// Method 3: For Solidity-level code, explicit cast
address cleaned = address(uint160(dirtyValue));
When dirty bits appear:
- Reading raw calldata in assembly (callers can send garbage in upper bytes)
- After arithmetic on packed fields (carry bits can leak)
- After
mloadon memory that wasnβt fully initialized - Return values from external calls (malicious contracts can return dirty data)
The rule: Always clean values before using them in comparisons, storage writes, or LOG topics. Dirty bits in storage waste gas (non-zero bytes cost more) and can cause logic errors.
π‘ Concept: Checking Multiple Conditions at Once
Goal: Check if several boolean conditions are all true (or all false) in a single operation.
Pattern: Pack conditions into a bitmap, then use AND to check them all at once.
uint256 constant ACTIVE = 1 << 0; // bit 0
uint256 constant NOT_PAUSED = 1 << 1; // bit 1
uint256 constant HAS_BALANCE = 1 << 2; // bit 2
uint256 constant REQUIRED = ACTIVE | NOT_PAUSED | HAS_BALANCE; // 0b_111
// Check all conditions at once
uint256 flags = ...; // Each bit set based on current state
bool allMet = (flags & REQUIRED) == REQUIRED;
How it works:
flags = 0b_101 (ACTIVE and HAS_BALANCE, but paused)
REQUIRED = 0b_111
flags & REQUIRED = 0b_101
0b_101 == 0b_111 β false (NOT_PAUSED bit missing)
Checking forbidden conditions (none should be set):
uint256 constant FORBIDDEN = BLACKLISTED | EXPIRED;
bool noForbidden = (flags & FORBIDDEN) == 0;
This replaces chained && conditions with a single AND + comparison β fewer opcodes, constant gas, cleaner code.
π― Build Exercise: BitToolkit
Workspace:
- Implementation:
workspace/src/deep-dives/bit-manipulation/exercise1-bit-toolkit/BitToolkit.sol - Tests:
workspace/test/deep-dives/bit-manipulation/exercise1-bit-toolkit/BitToolkit.t.sol
Build a reusable bit manipulation library that implements the core patterns from this deep dive. The library works with packed uint256 words and bitmap sets.
What youβll implement:
extractFieldβ Extract a field ofwidthbits atoffsetfrom a packed wordsetFieldβ Write a value into a field (read-modify-write pattern)bitmapAdd/bitmapRemove/bitmapContainsβ Bitmap set operationspopcountβ Count the number of set bits in a wordisolateLSBβ Isolate the lowest set bitisPowerOfTwoβ Check if a value is a power of 2branchlessSelectβ Select between two values without branching
π― Goal: Internalize the core bit manipulation patterns so that reading (and writing) packed data structures becomes second nature.
Run: forge test --match-contract BitToolkitTest -vvv
π Summary
β Covered:
- Foundations: AND, OR, XOR, NOT as filter/combiner/flipper/inverter; SHL, SHR, SAR with visual bit diagrams; BYTE opcode and big-endian byte ordering; precedence traps in Solidity vs explicit nesting in assembly
- Twoβs complement: Flip-and-add-1 encoding, the asymmetric number line, why addition works for both signed and unsigned, sign extension for type widening, SIGNEXTEND opcode for assembly, SAR vs SHR on negative values
- Masking patterns: Building masks from width and offset, extract/set/clear/toggle as a system, read-modify-write as the complete pattern, four common masking mistakes
- Bitmap sets: uint256 as 256 booleans, membership/add/remove in single opcodes, set algebra (union/intersection/difference) as AND/OR, population count with parallel counting, iteration via LSB clearing, multi-word extension with mapping
- LSB & MSB: Isolate lowest set bit with
x & -x, clear lowest withx & (x-1), count trailing zeros (loop, binary search, de Bruijn), find highest set bit with halving binary search, branchless MSB in assembly - Power-of-2 tricks: isPowerOfTwo, next power of two via bit smearing, alignment rounding (floor and ceiling), modulo by AND
- Bit idioms: Branchless select/min/max/abs, XOR swap (with aliasing warning), dirty bits and cleaning patterns, multi-condition checking with bitmap flags
π Resources
EVM Reference
- EVM Opcodes β Bitwise β Interactive reference for AND, OR, XOR, NOT, SHL, SHR, SAR, BYTE, SIGNEXTEND
- Solidity Docs β Operators β Precedence table and type rules
Bit Manipulation
- Bit Twiddling Hacks (Stanford) β Classic reference for bit tricks (C-focused, but the math is universal)
- Chess Programming Wiki β Bitboards β Deep dive on bitmap sets (chess uses 64-bit bitmaps extensively)
- Hackerβs Delight (Henry Warren) β The definitive book on bit manipulation
Production Code Using These Patterns
- Solady FixedPointMathLib β log2, sqrt, mulDiv using branchless bit tricks
- OpenZeppelin BitMaps β Multi-word bitmap implementation
Navigation: β Errors | Deep Dives Overview