I guess, a really fast solution on a modern CPU looks like this:
int mask = 0x2AAAAAAA; // special handling for the sign bit
int diff = Integer.bitCount(x & mask) - Integer.bitCount(x & ~mask);
Don't be fooled by the complicated code behind it, it's an intristic and translates to a single cycle instruction.
Now we have a number between about -16 and 16, which is divisible by 3 iff x is divisible by 3. The fastest way is probably a table packed into a single long.
return (table << diff) < 0;