Re: [ng-spice-devel] ACS
---"Alan" == Alan Gillespie <alan.gillespie@analog.com> writes:
Alan> Secondly, they mention something about being able to
Alan> reduce the numerical error by iterating after the direct
Alan> solve.
I haven't looked lately, but they're probably referring to iterative
improvement.
Alan> I was wondering if using single precision for the
Alan> direct solution, and optimising it for the new 3D-Now/SSIMD
Alan> instructions, and doing a few iterations afterwards, would
Alan> lead to a faster overall solution. (3D-Now and SSIMD don't
Alan> do double precision yet, as far as I'm aware.) Using 32 bit
Alan> variables would presumably allow bigger arrays into the
Alan> small cache, as well.
I don't think this is a good idea, but it might be interesting to test
if you have time.
The basic idea with iterative improvement is to reduce the residual
r = b - Ax of the computed solution x, to try to offset the effects of
error growth. I think, if one is lucky and numerical ill-conditioning
isn't fatal, it is possible to get some more clean bits in your solution. But it isn't
going to be possible to make a 32-bit solution "expand" to be as good
as a 64-bit solution -- iterative improvement would just give some more
good bits in the existing 32-bit or 64-bit solution.
We've seen some real circuits where a 64-bit solution isn't good
enough. (It isn't too hard to contrive an example where 64 bits isn't
enough. 64 bits is just 15-ish decimal digits; a circuit where gmin at
1e-12 is fighting with milli-ohm resistors (conductances around 1e3, so
a ratio of about 1e15) is starting to get ugly.)
Iterative improvement might be helpful for these cases. I've also seen
a paper from Bell Labs where they used a least-squares solution for
these nasty situations.
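The gmin arithmetic can be checked directly. The sketch below assumes a 1 milli-ohm resistor (1e3 siemens) in parallel with gmin = 1e-12 siemens and IEEE 754 doubles; the names G_SHORT, GMIN and recovered_gmin are made up for the illustration.

```python
import sys

G_SHORT = 1.0e3   # assumed: conductance of a 1 milli-ohm resistor, in siemens
GMIN = 1.0e-12    # gmin value quoted in the message, in siemens


def recovered_gmin(g_big=G_SHORT, g_small=GMIN):
    """Add the two conductances, then subtract the big one again.

    Whatever comes back is all that the sum actually kept of g_small.
    """
    total = g_big + g_small
    return total - g_big


def relative_error(g_big=G_SHORT, g_small=GMIN):
    """Relative error of the small conductance after the round trip."""
    return abs(recovered_gmin(g_big, g_small) - g_small) / g_small


print("decimal digits in a double:", sys.float_info.dig)
print("gmin after round trip:     ", recovered_gmin())
print("relative error:            ", relative_error())
```

With IEEE doubles the sum keeps only a few bits of gmin, so the value recovered by subtraction is off by about 2%. That is before any error growth in the factorization.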
Some ideas on 32-bit arithmetic were tried, with some twists, in
mini-MSINC, a minicomputer-based simulator of the 70s at Stanford. As
I vaguely remember, the results weren't all that great. Seems like
they were trying to use an indefinite admittance matrix, keeping most
of the matrix single-precision, but accumulating the RHS and the extra
matrix row in double-precision. As long as the iteration converges,
the accuracy of the Jacobian matrix doesn't affect the final answer,
but an inaccurate Jacobian _will_ slow convergence.
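A toy illustration of that last point, as a hedged sketch: a scalar Newton iteration on f(x) = x^2 - 2, run once with the exact derivative and once with a derivative that is deliberately 50% too large. The function names and the choice of test equation are mine, not anything from mini-MSINC. Both runs stop on the same residual test, so they land on the same root; the sloppy Jacobian just needs more iterations.

```python
import math


def newton(f, jac, x0, tol=1e-12, max_iter=200):
    """Newton iteration; returns (root, iterations used).

    Convergence is judged on |f(x)|, which is evaluated exactly,
    so an inaccurate jac affects the step but not the stopping test.
    """
    x = x0
    for n in range(1, max_iter + 1):
        x = x - f(x) / jac(x)
        if abs(f(x)) < tol:
            return x, n
    raise RuntimeError("Newton iteration did not converge")


def f(x):
    return x * x - 2.0


def jac_exact(x):
    return 2.0 * x


def jac_sloppy(x):
    # Deliberately wrong by 50%, standing in for a low-precision Jacobian.
    return 1.5 * jac_exact(x)


root_exact, n_exact = newton(f, jac_exact, 2.0)
root_sloppy, n_sloppy = newton(f, jac_sloppy, 2.0)
print(root_exact, n_exact)
print(root_sloppy, n_sloppy)
print(math.sqrt(2.0))
```

The same reasoning is why one can think about keeping the matrix in lower precision while accumulating the RHS more carefully: the RHS plays the role of f here, and the matrix plays the role of the Jacobian.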
Alan> What size of cache and circuit did you test it with ?
Oh, we most likely used some Sparc with a relatively large cache.
The circuits were probably some internal netlists, but the old
MCNC benchmarks can generate large enough matrices to be interesting.
Regards,
--Steve