The 486 "assume not taken" strategy (also common in other in-order cores) is not a branch prediction mechanism as such. It can be interpreted as one (in that it's essentially equivalent to predicting all branches not taken), but first and foremost it's simply what happens when the instruction fetch stage just keeps going through memory sequentially unless redirected.
In other words, it's what naturally happens in a pipelined processor in the _absence_ of any intentionally designed branch prediction mechanism.
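To make that concrete, here's a toy sketch (the function name and encoding are mine, purely illustrative) of what "assume not taken" amounts to: the fetch stage always computes the sequential next address, and only a resolved taken branch, several cycles later, redirects it.

```python
def next_fetch_pc(pc, insn_len, redirect_target=None):
    """Sequential fetch: follow a redirect only when the back end supplies one."""
    if redirect_target is not None:    # a resolved taken branch/jump
        return redirect_target
    return pc + insn_len               # the implicit "not taken" prediction

# A taken branch at 0x100 fetches wrong-path instructions until it resolves:
print(hex(next_fetch_pc(0x100, 4)))          # 0x104: wrong path if taken
print(hex(next_fetch_pc(0x100, 4, 0x200)))   # 0x200: redirect after resolution
```

The wrong-path instructions fetched in the meantime are what gets squashed when the branch resolves taken.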
As for static vs. dynamic prediction: to be making explicit static predictions, you actually need to know whether you have seen a given branch before, and many branch predictors simply don't know! That is, they have no way to tell reliably whether a given branch history (or branch target) entry is actually for the branch they're trying to predict. The various branch prediction structures are mostly just bags of bits indexed by a hash code.
Now some do have a few tag bits that allow them to detect collisions, probabilistically with high likelihood anyway, but others don't. It's just a matter of whether the architects think it's worthwhile to spend the area on tag bits, or whether they have another thing they'd rather do that is a better trade-off for their target criteria.
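A minimal sketch of such a tagless "bag of bits": a table of 2-bit saturating counters indexed by a hash of the branch address (here just the low PC bits; table size and hash are deliberately tiny and illustrative, not from any real design). Two branches that hash to the same slot silently share state, and the predictor has no way to notice:

```python
TABLE_BITS = 4                       # tiny table so aliasing is easy to show
table = [1] * (1 << TABLE_BITS)      # 2-bit counters, init "weakly not taken"

def index(pc):
    return (pc >> 2) & ((1 << TABLE_BITS) - 1)   # "hash": low PC bits

def predict(pc):
    return table[index(pc)] >= 2     # counter >= 2 means "predict taken"

def update(pc, taken):
    i = index(pc)
    table[i] = min(3, table[i] + 1) if taken else max(0, table[i] - 1)

a, b = 0x1000, 0x2000                # distinct branches, same table slot
for _ in range(3):
    update(a, True)                  # train branch `a` strongly taken
print(predict(b))                    # True: `b` inherits a's counter state
```

Tag bits would let the table detect that `b`'s lookup hit an entry trained by `a`; without them, the collision is invisible and just shows up as (usually tolerable) prediction noise.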
Netburst (Pentium 4) is an anomaly, AFAIK because of the way the branch prediction interacts with the trace cache. Namely, I believe that branch history etc. in Netburst was kept at the trace level. For trace cache entries, unlike branch history/target entries, there is a well-defined moment when they're new, namely when they're initialized after a trace cache miss (trace cache miss detection _has_ to do full tagging and be precise for the CPU to work); that's an opportunity to set up the new trace with the static prediction heuristic, and also branch hints. [But I'm no expert on this, and I bet there are Intel folks on this list that could be more precise.]
Zooming out a bit, real designs don't generally have a single monolithic branch prediction mechanism; they have lots of them, in different places in the pipeline. Going roughly in pipeline order, some common mechanisms are:
1. Next I$ fetch line prediction. The address for the next instruction fetch needs to be determined before the current fetch even completes, for there to be no bubbles.
2. Early branch target prediction (indexed by instruction fetch address); instruction fetch is usually multiple cycles, and again you need to know the likely address to fetch next before you've even seen any instruction bytes or know where branches are!
3. A later, more precise target/direction predictor. The early predictors are "quick and dirty", and not very accurate. In parallel, you start working on a more accurate prediction; if the late predictor disagrees with the early predictor, you need to squash a few cycles' worth of instructions, but this is still fairly early in the pipeline, so a lot cheaper than a full misprediction.
4. A return stack predictor for function returns, the most common type of indirect branch by far. Tracks function calls and returns and supplies what it thinks are likely targets for return instructions.
5. A separate predictor for other indirect branch targets (e.g. through virtual or jump tables). BTBs can cover some of this but dedicated indirect predictors work better. Again, an indirect predictor can decide that an earlier-guessed branch target was bogus and initiate a partial flush.
6. There are sometimes also special loop predictors to catch counted loops that (almost) always execute the same number of iterations. Innermost loop nests are usually covered well by the regular global branch history, but loop predictors help with outer loop nests that are periodic on longer time scales than the global branch history captures.
7. Once instructions are decoded, branch target addresses (for direct branches) are verified. Branch target addresses in the BTBs and other structures are commonly stored with a reduced number of bits, and in certain (hopefully rare) cases the "decompressed" address will be incorrect. Again, if this is detected, a partial flush is initiated; we're now several cycles downstream from initial instruction fetch, but this is still far from a full branch misprediction penalty.
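The return stack (mechanism 4 above) is simple enough to sketch end to end. This is a hedged toy model; real designs differ in depth, overflow/underflow policy, and how they repair the stack after mispredictions:

```python
class ReturnStack:
    """Toy return-address-stack predictor (illustrative, not any real CPU)."""

    def __init__(self, depth=16):
        self.depth = depth
        self.stack = []

    def on_call(self, call_pc, insn_len):
        if len(self.stack) == self.depth:      # fixed capacity: drop oldest
            self.stack.pop(0)
        self.stack.append(call_pc + insn_len)  # return addr = fall-through

    def predict_return(self):
        # Underflow (e.g. recursion deeper than the stack) gives no useful
        # prediction; a real core would fall back to the BTB here.
        return self.stack.pop() if self.stack else None

rs = ReturnStack()
rs.on_call(0x400, 5)                  # call at 0x400, 5-byte instruction
rs.on_call(0x500, 5)                  # nested call
print(hex(rs.predict_return()))       # 0x505: innermost return predicted first
print(hex(rs.predict_return()))       # 0x405
```

Because call/return pairs nest, a plain stack predicts return targets near-perfectly as long as it doesn't overflow, which is why it beats a generic indirect predictor for returns.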
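And to illustrate the compressed-target point in the last item: one common-style scheme (sizes and splicing rule here are hypothetical) stores only the low bits of the target and reconstructs the rest from the branch's own address. That's correct when the target is nearby, and wrong when the branch crosses the implied boundary, which is exactly the case decode has to catch:

```python
TARGET_BITS = 16
MASK = (1 << TARGET_BITS) - 1

def compress(target):
    return target & MASK              # store only the low 16 bits

def decompress(branch_pc, stored):
    # Splice stored low bits onto the branch's high bits, i.e. assume the
    # target lies in the same 64 KiB region as the branch itself.
    return (branch_pc & ~MASK) | stored

pc   = 0x12345
near = 0x12400                        # same region: reconstructs correctly
far  = 0x20040                        # crosses the boundary: reconstructs wrong
print(hex(decompress(pc, compress(near))))   # 0x12400, correct
print(hex(decompress(pc, compress(far))))    # 0x10040, wrong -> fix at decode
```

The architects' bet is that far targets are rare enough that the occasional partial flush at decode costs less than the area for full-width target storage.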
CPU vendors are usually close-mouthed about what exactly they're doing (it's some of their secret sauce), but some are more open than others. One not-too-old CPU design that has a relatively thorough description of the branch prediction architecture (though not the nitty-gritty details) is the AMD Jaguar (low-power x86):
Section 2.7.1 in the PDF has a (very) brief description of the various branch prediction mechanisms.