The Matrix Market File Format


Code and Libraries


Code for reading and analyzing graphs in mtx format (as well as many other graph formats) is available in the PMC and PGD libraries.

Note that the graph readers in PMC and PGD are quite robust and can, in many cases, read a graph and infer its format. The mtx readers in those libraries can also read mtx files that violate one or more of the MTX format requirements.

Comment lines begin with at least one %. The header line is usually the first line of the file; it begins with one % or, in many cases, two (%%), followed by MatrixMarket and, in most cases, four fields that describe the stored data. In general, the first line without a leading % is the size line N M K, where N is the number of rows, M is the number of columns, and K is the number of nonzeros in the matrix. For undirected graphs, this line has the form N N M, where N is the number of nodes and M is the number of edges. For instance, a size line of 4 4 6 indicates that the number of nodes is N=4 and the number of edges is M=6.
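The rule for locating the size line can be sketched in a few lines of Python. This is only an illustration: the function name read_size_line and the sample file contents are assumptions made for this example, not code taken from PMC or PGD.

```python
def read_size_line(lines):
    """Return (rows, cols, nonzeros) from the first line that does not start with '%'."""
    for line in lines:
        stripped = line.strip()
        # Skip blank lines as well as header and comment lines.
        if not stripped or stripped.startswith("%"):
            continue
        n_rows, n_cols, n_nonzeros = (int(tok) for tok in stripped.split()[:3])
        return n_rows, n_cols, n_nonzeros
    raise ValueError("no size line found")


# Hypothetical undirected graph with 4 nodes and 6 edges.
example = """%%MatrixMarket matrix coordinate pattern symmetric
% comment lines may appear before the size line
4 4 6
2 1
3 1
4 1
3 2
4 2
4 3
"""

print(read_size_line(example.splitlines()))  # (4, 4, 6)
```

For an undirected graph the first two numbers are equal (the node count), and the third is the edge count.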

A graph file with the extension .mtx is read (by PGD and PMC) using a (somewhat) strict mtx graph reader. Thus, if the graph file does not strictly follow the above mtx format (e.g., if the graph is an edge list without the header line or without the line that encodes the number of rows, columns, and nonzeros), the file extension should be changed so that the file is read by the more flexible graph reader instead.




MM/MTX File Format


A file in the Matrix Market format comprises four parts:

  1. Header line: contains an identifier, and four text fields;
  2. Comment lines: allow a user to store information and comments;
  3. Size line: specifies the number of rows and columns, and the number of nonzero elements;
  4. Data lines: specify the location of the matrix entries (implicitly or explicitly) and their values.

The header line has the form

%MatrixMarket object format field symmetry

or

%%MatrixMarket object format field symmetry

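The header line must be the first line of the file, and it must begin with the string %MatrixMarket or %%MatrixMarket. The four fields that follow that string are:

  1. object: usually matrix;
  2. format: either coordinate or array;
  3. field: one of real, complex, integer, or pattern;
  4. symmetry: one of general, symmetric, skew-symmetric, or hermitian.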
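A minimal Python sketch of a header check, assuming the identifier must match exactly and that exactly four fields follow it. The function name parse_header and the returned dictionary layout are hypothetical choices for this example; a tolerant reader might accept more variations than this.

```python
def parse_header(line):
    """Split a Matrix Market header line into its four descriptive fields."""
    tokens = line.strip().split()
    # The identifier may be written with one or two leading percent signs.
    if not tokens or tokens[0] not in ("%MatrixMarket", "%%MatrixMarket"):
        raise ValueError("not a Matrix Market header line")
    if len(tokens) != 5:
        raise ValueError("expected four fields after the identifier")
    obj, fmt, field, symmetry = (tok.lower() for tok in tokens[1:])
    return {"object": obj, "format": fmt, "field": field, "symmetry": symmetry}


header = parse_header("%%MatrixMarket matrix coordinate real general")
print(header["format"], header["symmetry"])  # coordinate general
```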

If the field of a matrix is pattern, then only the locations of the nonzeros will be listed. This presumes, obviously, that we are using the coordinate format!

If the symmetry of a matrix is symmetric or hermitian, then only the entries on or below the main diagonal are to be listed. If the symmetry is skew-symmetric, then only the entries strictly below the main diagonal are to be listed.
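Because only part of a symmetric, skew-symmetric, or hermitian matrix is stored, a reader has to mirror each off-diagonal entry to recover the full matrix. The following Python sketch shows one way to do that; the function name expand_entries and the (i, j, value) tuple representation are assumptions made for this example.

```python
def expand_entries(entries, symmetry):
    """Return all entries of the full matrix, given the stored (i, j, value) triples."""
    if symmetry not in ("general", "symmetric", "skew-symmetric", "hermitian"):
        raise ValueError("unknown symmetry: " + symmetry)
    full = []
    for i, j, value in entries:
        full.append((i, j, value))
        # Diagonal entries and general matrices have nothing to mirror.
        if symmetry == "general" or i == j:
            continue
        if symmetry == "symmetric":
            full.append((j, i, value))
        elif symmetry == "skew-symmetric":
            full.append((j, i, -value))
        else:  # hermitian: the mirrored entry is the complex conjugate
            full.append((j, i, complex(value).conjugate()))
    return full


stored = [(1, 1, 2.0), (2, 1, 3.0)]
print(expand_entries(stored, "symmetric"))
# [(1, 1, 2.0), (2, 1, 3.0), (1, 2, 3.0)]
```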

The comment lines, if any, should follow the header line. The only requirement is that each comment line begin with a percent sign.

If format was specified as array, then the size line has the form:

        m n
      
where m is the number of rows and n is the number of columns.

If format was specified as coordinate, then the size line has the form:

        m n nonzeros
      
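where m is the number of rows, n is the number of columns, and nonzeros is the number of entries listed in the data lines that follow.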

If format was specified as array, there must follow exactly m * n data lines, one for each entry, listed by columns, having the form

        value
      
where value is the value of the corresponding matrix entry.
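Since the array format lists values by columns, the k-th data line (counting from zero) belongs to row k mod m of column k div m. A short Python sketch of that mapping follows, assuming general symmetry (all m * n values are present) and one real number per line; the name read_array_values is hypothetical.

```python
def read_array_values(values, m, n):
    """Arrange m * n column-ordered values into an m-by-n list of rows."""
    if len(values) != m * n:
        raise ValueError("expected exactly m * n values")
    matrix = [[0.0] * n for _ in range(m)]
    for k, value in enumerate(values):
        # Values run down each column before moving on to the next column.
        col, row = divmod(k, m)
        matrix[row][col] = value
    return matrix


print(read_array_values([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 2, 3))
# [[1.0, 3.0, 5.0], [2.0, 4.0, 6.0]]
```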

If format was specified as coordinate, there must follow exactly nonzeros data lines, one for each matrix entry that is to be listed, having the form

        i j value
      
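where i and j are the row and column indices of the entry, counted from 1, and value is its value. If the field is pattern, value is omitted.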
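A coordinate data line can be parsed as in the following Python sketch. It assumes 1-based indices in the file (converted here to 0-based), two numbers (real and imaginary part) for complex values, and a placeholder value of 1 for pattern entries. The function name parse_coordinate_line and the choice of placeholder are assumptions made for this example.

```python
def parse_coordinate_line(line, field="real"):
    """Return (row, col, value) with 0-based indices from one coordinate data line."""
    tokens = line.split()
    i, j = int(tokens[0]), int(tokens[1])
    if field == "pattern":
        value = 1  # pattern files list locations only; 1 is a placeholder
    elif field == "integer":
        value = int(tokens[2])
    elif field == "complex":
        value = complex(float(tokens[2]), float(tokens[3]))
    else:
        value = float(tokens[2])
    # Matrix Market indices start at 1; Python containers start at 0.
    return i - 1, j - 1, value


print(parse_coordinate_line("2 1 3.5"))                  # (1, 0, 3.5)
print(parse_coordinate_line("4 3", field="pattern"))     # (3, 2, 1)
```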