Re: [ng-spice-devel] Ng-spice and SMP was: Re: [ng-spice-devel] Catching up


To ng-spice-devel@ieee.ing.uniroma1.it
From Paolo Nenzi <pnenzi@ieee.ing.uniroma1.it>
Date Sun, 18 Feb 2001 20:34:48 +0100 (CET)
Delivered-To mailing list ng-spice-devel@ieee.ing.uniroma1.it
In-Reply-To <0102111055210V.01056@hobbes >
Mailing-List contact ng-spice-devel-help@ieee.ing.uniroma1.it; run by ezmlm
Reply-To ng-spice-devel@ieee.ing.uniroma1.it



On Sun, 11 Feb 2001, Al Davis wrote:

> On Sun, 11 Feb 2001, Paolo Nenzi wrote:
> > > The biggest difference is likely to be in how to interact with
> > > it. This is why ACS uses its own sparse matrix package.
> >
> > Al, can you tell us something about the differences between Sparse and
> > the ACS matrix package?
> 
> Here are a few differences:
> 
> 
> 
> The ACS matrix package makes assumptions about the pattern of the 
> data stored in it.  If these assumptions are not valid, you get less 
> than optimal performance.  Sparse makes no such assumptions.  
> Instead, it attempts to reorder in the general case.  This is one of 
> the reasons why the ACS package is smaller.
> 
> The ACS matrix package does not reorder the matrix.  It expects it to 
> be properly ordered from the beginning, based on the assumption that 
> the caller knows more about the data than it does.  Sparse reorders 
> the matrix, using the Markowitz algorithm, which in my tests (over 10 
> years ago) almost always produced worse ordering than I could do 
> manually.  


OK, you said manually: does this mean by circuit inspection, or are there
heuristics or a theory for preordering the matrix? I mean a set of rules that
says which rows to exchange and how.


> 
> The ACS matrix package uses a vector representation, so that the 
> innermost loop (most speed critical) is a vector operation, 
> minimizing the cache misses in the inner loop.  Sparse uses linked 
> lists.

This sounds very interesting. I think this will further enhance
ACS speed on machines with a SIMD FPU (if there are any in the cheap market).
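
To make the contrast concrete, here is a minimal sketch in C of the two
inner loops being compared. It is not taken from Sparse or from ACS: the
struct elem layout and the names dot_list and dot_dense are hypothetical,
chosen only to show pointer chasing next to a contiguous vector loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical linked-list element: one heap node per nonzero entry. */
struct elem {
    int col;             /* column index of this entry */
    double val;          /* numerical value */
    struct elem *next;   /* next nonzero in the same row */
};

/* Row-times-vector over a linked list: every step follows a pointer,
 * so consecutive entries may live on unrelated cache lines. */
double dot_list(const struct elem *row, const double *x)
{
    double sum = 0.0;
    for (; row != NULL; row = row->next)
        sum += row->val * x[row->col];
    return sum;
}

/* The same product when the row is stored as n contiguous values
 * covering columns first .. first+n-1: a plain vector loop that the
 * compiler is free to vectorize. */
double dot_dense(const double *vals, size_t first, size_t n, const double *x)
{
    assert(vals != NULL && x != NULL);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += vals[i] * x[first + i];
    return sum;
}
```

Both functions compute the same sum; only the memory access pattern
differs, which is the point about cache misses made above.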

> The ACS matrix package precomputes the parameters needed for 
> allocation, then allocates the entire matrix in a manner that tries 
> to keep it on a single memory page if possible.  Sparse uses multiple 
> allocations, making it more likely to get cache misses.

Again, this is interesting. On cache misses, what about inserting
prefetch asm instructions shortly before the loops to load blocks of data
into the cache and thus further enhance the speed?
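
As a rough sketch of what I have in mind, written in C rather than asm:
GCC and Clang provide the __builtin_prefetch builtin, which emits a
prefetch hint where the target supports one. The function name
sum_with_prefetch and the distance PREFETCH_AHEAD are my own assumptions
for illustration, and whether this helps at all would have to be
measured, since hardware prefetchers may already handle a sequential
loop like this one.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed prefetch distance in elements; a tuning knob, not a
 * recommended value. */
#define PREFETCH_AHEAD 16

/* Sums n doubles, hinting the cache a few elements ahead of use. */
double sum_with_prefetch(const double *v, size_t n)
{
    assert(v != NULL || n == 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
#if defined(__GNUC__)
        /* Read access (0), low temporal locality (1). The hint never
         * changes the result, only the timing. */
        if (i + PREFETCH_AHEAD < n)
            __builtin_prefetch(&v[i + PREFETCH_AHEAD], 0, 1);
#endif
        sum += v[i];
    }
    return sum;
}
```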

 
> The ACS matrix package allows you to make incremental changes to the 
> matrix, then solves only the parts of LU that are changed as a result 
> of the original change.  This means that it is not necessary to 
> rebuild the matrix for every iteration.  Sparse requires you to 
> rebuild and re-solve the whole thing.

Is documentation available on this?
 
> As a side effect of the algorithm that lets you solve parts of the 
> matrix, the ACS matrix package finds hinge points in the matrix, 
> which should make it really easy to partition the matrix for solution 
> on parallel processors.   Just break it everywhere that basenode[x] 
> == x.  Global Markowitz ordering would destroy this ability to have 
> hinge points.
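
If I read the rule above correctly, finding the break points is a single
pass over the array. The sketch below is only my reading of
"break it everywhere that basenode[x] == x": the exact definition of
basenode in ACS is not reproduced here, and the function name
find_blocks, the 0-based indexing, and the starts output array are
assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Scans basenode[0..n-1] and records every x with basenode[x] == x as
 * the start of a new block. starts must have room for n entries.
 * Returns the number of blocks found. */
size_t find_blocks(const size_t *basenode, size_t n, size_t *starts)
{
    assert(basenode != NULL && starts != NULL);
    size_t count = 0;
    for (size_t x = 0; x < n; x++) {
        if (basenode[x] == x)
            starts[count++] = x;   /* hinge point: a block begins here */
    }
    return count;
}
```

Each block would then run from one recorded start up to the next, and
those ranges are the candidates to hand to separate processors.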

Could you contribute to the project by publishing references to the papers you
used (or wrote) to develop the sparse matrix solver? I will publish a new
section of the site with the new simulator.


Paolo

