% This is LLNCS.DEM the demonstration file of
% the LaTeX macro package from Springer-Verlag
% for Lecture Notes in Computer Science,
% version 2.2 for LaTeX2e
%
\documentclass{llncs}
%
\usepackage{makeidx}  % allows for indexgeneration
\usepackage{graphicx} % for gnuplot epslatex stuff
\usepackage{color}    % ditto
\usepackage{pstricks} % for inkscape TeX output
%
\begin{document}
%
\mainmatter              % start of the contributions
%
\title{Reinstating Floyd-Steinberg: Improved Metrics for Quality Assessment
of Error Diffusion Algorithms}
%
\titlerunning{Adapting Qualitative Metrics to Common Error Diffusion Algorithms}  % abbreviated title (for running head)
%                                     also used for the TOC unless
%                                     \toctitle is used
%
\author{Sam Hocevar\inst{1} \and Gary Niger\inst{2}}
%
\authorrunning{Sam Hocevar et al.} % abbreviated author list (for running head)
%
%%%% modified list of authors for the TOC (add the affiliations)
\tocauthor{Sam Hocevar, Gary Niger (Laboratoire d'Imagerie Bureautique et de
Conception Artistique)}
%
\institute{Laboratoire d'Imagerie Bureautique et de Conception Artistique\\
14 rue de Plaisance, Paris, France
\and
143 Rolloffle Avenue, Tarzana, California 91356\\
\email{sam@hocevar.net}, \email{gary\_niger@gnaa.us}}

\maketitle              % typeset the title of the contribution

\begin{abstract}
In this contribution we introduce a little-known property of error diffusion
halftoning algorithms which we call {\it error diffusion displacement}.
By accounting for the inherent sub-pixel displacement caused by the error
propagation, we correct an important flaw in most metrics used to assess the
quality of resulting halftones. We find that these metrics usually underestimate
the quality of error diffusion by a wide margin in comparison to more modern
algorithms such as direct binary search.
Using empirical observation, we give a method for creating computationally
efficient, image-independent, model-based metrics for this quality assessment.
Finally, we use the properties of error diffusion displacement to justify
Floyd and Steinberg's well-known choice of algorithm coefficients.

{\bf Keywords}: halftoning, error diffusion, image quality, human visual
system, color quantization
\end{abstract}
%
\section{Introduction}

Image dithering is the process of reducing continuous-tone images to images
with a limited number of available colours. Applications vary tremendously,
from laser and ink-jet printing to display on small devices such as cellphones,
or even the design of banknotes.

Countless methods have been published over the last 40 years that try to best
address the problem of colour reduction. Comparing two algorithms in terms of
speed or memory usage is often straightforward, but how well a halftoning
algorithm performs quality-wise is a far more complex issue, as it depends
heavily on the display device and the inner workings of the human eye.

Though this document focuses on the particular case of bilevel halftoning,
most of our results can be directly adapted to the more generic problem of
colour reduction.

\section{Halftoning algorithms}

The oldest halftoning method is probably classical screening. This highly
parallelisable algorithm consists of tiling a dither matrix over the image
and using its elements as threshold values. Classical screening is known for
its structural artifacts such as the cross-hatch patterns caused by Bayer
ordered dither matrices \cite{bayer}. However, modern techniques such as the
void-and-cluster method \cite{void1,void2} make it possible to generate screens
yielding visually pleasing results.

\medskip Error diffusion dithering, introduced in 1976 by Floyd and Steinberg
\cite{fstein}, tries to compensate for the thresholding error through the use
of feedback. Typically applied in raster scan order, it uses an error diffusion
matrix such as the following one, where $x$ denotes the pixel being processed:

\[ \frac{1}{16} \left| \begin{array}{ccc}
- & x & 7 \\
3 & 5 & 1 \end{array} \right| \]
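
\medskip As a minimal sketch of one diffusion step, in notation that is ours
and not the original authors', let $u_{i,j} \in [0,1]$ be the current value of
the pixel at column $i$ and row $j$, including the error already received from
previously processed neighbours. Assuming a fixed threshold of $\frac{1}{2}$,
the output pixel $b_{i,j}$ and the error $e_{i,j}$ are:

\[ b_{i,j} = \left\{ \begin{array}{ll}
1 & \mbox{if } u_{i,j} \ge \frac{1}{2} \\
0 & \mbox{otherwise} \end{array} \right.
\qquad e_{i,j} = u_{i,j} - b_{i,j} \]

\noindent and the error is then distributed to the four unprocessed neighbours
following the matrix above:

\[ \begin{array}{rcl}
u_{i+1,j}   & \leftarrow & u_{i+1,j}   + \frac{7}{16} e_{i,j} \\
u_{i-1,j+1} & \leftarrow & u_{i-1,j+1} + \frac{3}{16} e_{i,j} \\
u_{i,j+1}   & \leftarrow & u_{i,j+1}   + \frac{5}{16} e_{i,j} \\
u_{i+1,j+1} & \leftarrow & u_{i+1,j+1} + \frac{1}{16} e_{i,j}
\end{array} \]

\noindent The four weights add up to $\frac{16}{16} = 1$, so the whole
thresholding error is propagated.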

Though efforts have been made to make error diffusion parallelisable
\cite{parfstein}, it is generally considered more computationally expensive
than screening. In return, carefully chosen coefficients yield good visual
results \cite{kite}.

\medskip Model-based halftoning is the third important algorithm category. It
relies on a model of the human visual system (HVS) and attempts to minimise
an error value based on that model. One such algorithm is direct binary search
(DBS) \cite{allebach}, also referred to as least-squares model-based halftoning
(LSMB) \cite{lsmb}.

HVS models are usually low-pass filters. Nasanen \cite{nasanen} and Analoui and
Allebach \cite{allebach} found that using Gaussian models gave visually
pleasing results, an observation confirmed by independent visual perception
studies \cite{mcnamara}.

DBS yields halftones of impressive quality. However, despite efforts to make
it more efficient \cite{bhatt}, it suffers from its large computational
requirements and error diffusion remains a more widely used technique.

\section{Error diffusion displacement}

Most error diffusion implementations process the image in raster scan order.
Boustrophedonic (serpentine) scanning has been shown to cause fewer visual
artifacts \cite{halftoning}, but other, more complex processing paths such as
Hilbert curves \cite{spacefilling} are seldom used as they do not improve the
image quality significantly.

Intuitively, as the error is always propagated to the bottom-left or
bottom-right of each pixel (Fig. \ref{fig:direction}), one may expect the
resulting image to be slightly translated. This expectation is confirmed
visually when rapidly switching between an error diffused image and the
corresponding DBS halftone.

\begin{figure}
  \begin{center}
    \input{direction}
    \caption{Floyd-Steinberg error diffusion direction in raster scan (left)
             and serpentine scan (right).}\label{fig:direction}
  \end{center}
\end{figure}

This small translation is visually innocuous, but we found that it has a
significant effect on error computation. A common way to compute the error
between an image $h_{i,j}$ and the corresponding binary halftone $b_{i,j}$ is
to compute the mean square error between modified versions of the images, in
the form:

\begin{equation}
  E(h,b) = \frac{(||v * h_{i,j} - v * b_{i,j}||_2)^2}{wh}
\end{equation}

\noindent where $w$ and $h$ in the denominator are the image width and height
(not to be confused with the image $h_{i,j}$), $*$ denotes the convolution and
$v$ is a model for the human visual system.

To compensate for the slight translation observed in the halftone, we use the
following error metric instead:

\begin{equation}
  E_{dx,dy}(h,b) = \frac{(||v * h_{i,j} - v * t_{dx,dy} * b_{i,j}||_2)^2}{wh}
\end{equation}

\noindent where $t_{dx,dy}$ is an operator which translates the image along the
$(dx,dy)$ vector. By design, $E_{0,0} = E$.
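
For illustration only, and assuming that images are treated as functions
sampled on the pixel grid, the squared norm in the metric above can be expanded
as a sum over all pixels:

\[ E_{dx,dy}(h,b) = \frac{1}{wh} \sum_{i,j}
   \left( (v * h)_{i,j} - (v * t_{dx,dy} * b)_{i,j} \right)^2 \]

\noindent One possible reading of the translation operator, using a unit
impulse $\delta$ that is our own notation, is:

\[ t_{dx,dy}(x,y) = \delta(x-dx,y-dy)
   \quad \mbox{so that} \quad
   (t_{dx,dy} * b)(x,y) = b(x-dx,y-dy) \]

\noindent Since convolution is associative and commutative, the translation can
then be applied once to the filter $v$ instead of to every halftone, which
also gives a meaning to non-integer values of $(dx,dy)$.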

A simple example can be given using a Gaussian HVS model:

\begin{equation}
  v(x,y) = e^{-\frac{x^2+y^2}{2\sigma^2}}
\end{equation}

Finding the second filter is then straightforward:

\begin{equation}
  (v * t_{dx,dy})(x,y) = e^{-\frac{(x-dx)^2+(y-dy)^2}{2\sigma^2}}
\end{equation}

Experiments show that for a given image and a given corresponding halftone,
$E_{dx,dy}$ almost always has a local minimum away from $(dx,dy) = (0,0)$ (Fig.
\ref{fig:lena-values}). Let $E$ be an error metric for which this remains true.
We call the value at this local minimum $E_{min}$:

\begin{equation}
  E_{min}(h,b) = \min_{dx,dy}E_{dx,dy}(h,b)
\end{equation}

\begin{figure}
  \begin{minipage}[c]{0.8\textwidth}
    \input{lena-values}
  \end{minipage}
  \begin{center}
    \caption{Mean square error for the \textit{Lena} image ($\times10^4$). $v$
             is a simple $11\times11$ Gaussian convolution kernel with $\sigma
             = 1.2$ and $(dx,dy)$ vary in $[-1,1]\times[-1,1]$.}
    \label{fig:lena-values}
  \end{center}
\end{figure}

For instance, a Floyd-Steinberg dither of \textit{Lena} with $\sigma = 1.2$
yields a per-pixel mean square error of $3.67\times10^{-4}$. However, when
taking the displacement into account, the error becomes $3.06\times10^{-4}$ for
$(dx,dy) = (0.165,0.293)$. The new, corrected error is significantly smaller,
with the exact same input and output images.

Experiments show that the corrected error is always noticeably smaller except
in the case of images that are already mostly pure black and white. The
experiment was performed on a database of 10,000 images from common computer
vision sets and from the image board \textit{4chan}, providing a representative
sampling of the photographs, digital art and business graphics widely exchanged
on the Internet nowadays \cite{4chan}.

In addition to the classical Floyd-Steinberg and Jarvis-Judice-Ninke kernels,
we tested two serpentine error diffusion algorithms: Ostromoukhov's simple
error diffusion \cite{ostromoukhov}, which uses a variable coefficient kernel,
and Wong and Allebach's optimum error diffusion kernel \cite{wong}:

\begin{center}
  \begin{tabular}{|l|c|c|}
  \hline
  &~ $E\times10^4$ ~&~ $E_{min}\times10^4$ ~\\ \hline
  ~raster Floyd-Steinberg ~&~ 3.7902 ~&~ 3.1914 ~\\ \hline
  ~raster Ja-Ju-Ni        ~&~ 9.7013 ~&~ 6.6349 ~\\ \hline
  ~Ostromoukhov           ~&~ 4.6892 ~&~ 4.4783 ~\\ \hline
  ~optimum kernel         ~&~ 7.5209 ~&~ 6.5772 ~\\
  \hline
  \end{tabular}
\end{center}

We clearly see that usual metrics underestimate the quality of error-diffused
halftones, especially in raster scan. Algorithms such as direct binary search,
on the other hand, do not suffer from this bias since they are designed to
minimise the very error induced by the HVS model.
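
To make the size of this bias easier to read, the table above can be restated
as a relative reduction $(E - E_{min}) / E$. This ratio is only a convenience
we introduce here, not a metric used in the experiments, and the percentages
are rounded:

\[ \begin{array}{lcl}
\mbox{raster Floyd-Steinberg} & (3.7902 - 3.1914) / 3.7902 & \approx 15.8\% \\
\mbox{raster Ja-Ju-Ni}        & (9.7013 - 6.6349) / 9.7013 & \approx 31.6\% \\
\mbox{Ostromoukhov}           & (4.6892 - 4.4783) / 4.6892 & \approx 4.5\% \\
\mbox{optimum kernel}         & (7.5209 - 6.5772) / 7.5209 & \approx 12.5\%
\end{array} \]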

\section{An image-independent corrected quality metric for error-diffused
halftones}

We have seen that for a given image, $E_{min}(h,b)$ is a better and fairer
visual error measurement than $E(h,b)$. However, its major drawback is that it
is highly computationally expensive: for each image, the new $(dx,dy)$ values
need to be calculated to minimise the error value.

Fortunately, we found that for a given raster or serpentine scan
error diffusion algorithm, there was often very little variation in
the optimal $(dx,dy)$ values (Fig. \ref{fig:table-historaster} and
\ref{fig:table-histoserp}).

\begin{figure}
  \begin{center}
    \begin{minipage}[c]{0.50\textwidth}
      \input{fs-histo}
    \end{minipage}
    \begin{minipage}[c]{0.40\textwidth}
      \input{jajuni-histo}
    \end{minipage}
    \caption{Error diffusion displacement histograms for the raster
             Floyd-Steinberg (left) and raster Jarvis, Judice and Ninke (right)
             algorithms applied to a corpus of 10,000 images.}
    \label{fig:table-historaster}
  \end{center}
\end{figure}

\begin{figure}
  \begin{center}
    \begin{minipage}[c]{0.50\textwidth}
      \input{ostro-histo}
    \end{minipage}
    \begin{minipage}[c]{0.40\textwidth}
      \input{serpopt-histo}
    \end{minipage}
    \caption{Error diffusion displacement histograms for the Ostromoukhov
             (left) and optimum kernel (right) algorithms applied to a corpus
             of 10,000 images.}
    \label{fig:table-histoserp}
  \end{center}
\end{figure}

For each algorithm, we choose the $(dx,dy)$ values at the histogram peak and
we refer to them as the \textit{algorithm's displacement}, as opposed to the
\textit{image's displacement} for a given algorithm. We call $E_{fast}(h,b)$
the error computed at the algorithm's displacement. As this displacement does
not depend on the image, $E_{fast}$ is a lot faster to compute than $E_{min}$,
and as it is statistically closer to $E_{min}$, we can expect it to be a better
error estimation than $E$:

\begin{center}
  \begin{tabular}{|l|c|c|c|c|c|}
  \hline
  &~ $E\times10^4$ ~&~ $E_{min}\times10^4$ ~&~ $dx$ ~&~ $dy$ ~&~ $E_{fast}\times10^4$ ~\\ \hline
  ~raster Floyd-Steinberg ~&~ 3.7902 ~&~ 3.1914 ~&~ 0.16 ~&~ 0.28 ~&~ 3.3447 ~\\ \hline
  ~raster Ja-Ju-Ni        ~&~ 9.7013 ~&~ 6.6349 ~&~ 0.26 ~&~ 0.76 ~&~ 7.5891 ~\\ \hline
  ~Ostromoukhov           ~&~ 4.6892 ~&~ 4.4783 ~&~ 0.00 ~&~ 0.19 ~&~ 4.6117 ~\\ \hline
  ~optimum kernel         ~&~ 7.5209 ~&~ 6.5772 ~&~ 0.00 ~&~ 0.34 ~&~ 6.8233 ~\\
  \hline
  \end{tabular}
\end{center}
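
As a rough reading of the table above, one may ask which fraction of the gap
between $E$ and $E_{min}$ is recovered by the cheaper $E_{fast}$. The ratio
$(E - E_{fast}) / (E - E_{min})$ below is our own illustrative quantity,
computed from the tabulated values and rounded:

\[ \begin{array}{lcl}
\mbox{raster Floyd-Steinberg} & (3.7902 - 3.3447) / (3.7902 - 3.1914) & \approx 74\% \\
\mbox{raster Ja-Ju-Ni}        & (9.7013 - 7.5891) / (9.7013 - 6.6349) & \approx 69\% \\
\mbox{Ostromoukhov}           & (4.6892 - 4.6117) / (4.6892 - 4.4783) & \approx 37\% \\
\mbox{optimum kernel}         & (7.5209 - 6.8233) / (7.5209 - 6.5772) & \approx 74\%
\end{array} \]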

\section{Using error diffusion displacement for optimum kernel design}

We believe that our higher quality $E_{min}$ error metric may be useful in
kernel design, because it is the very same error that admittedly superior yet
computationally expensive algorithms such as DBS try to minimise.

Our first experiment was a study of the Floyd-Steinberg-like 4-block error
diffusion kernels. According to the original authors, the coefficients were
found ``mostly by trial and error'' \cite{fstein}. With our improved metric, we
now have the tools to confirm or refute Floyd and Steinberg's initial choice.

We chose to do an exhaustive study of every $\frac{1}{16}\{a,b,c,d\}$ integer
combination. We deliberately chose non-negative integers whose sum was 16:
error diffusion coefficients smaller than zero or adding up to more than 1 are
known to be unstable \cite{stability}, and diffusing less than 100\% of the
error causes significant loss of detail in the shadow and highlight areas of
the image.
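
To give an idea of the size of this search space, and assuming that zero
coefficients are allowed (as the rankings below suggest), the number of
$\{a,b,c,d\}$ combinations whose sum is 16 follows from a standard counting
argument:

\[ {16 + 3 \choose 3} = \frac{19 \times 18 \times 17}{3 \times 2 \times 1}
   = 969 \]

\noindent If all four coefficients were instead required to be strictly
positive, the same argument would give ${15 \choose 3} = 455$ combinations.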

We studied all possible coefficients on a pool of 3,000 images with an error
metric $E$ based on a standard Gaussian HVS model. $E_{min}$ is only given here
as an indication and only $E$ was used to select the best coefficients:

\begin{center}
  \begin{tabular}{|c|c|c|c|}
  \hline
  ~ rank ~&~ coefficients ~&~ $E\times10^4$ ~&~ $E_{min}\times10^4$ ~\\ \hline
  ~ 1 ~&~ 7 3 6 0 ~&~ 4.65512 ~&~ 3.94217 ~\\ \hline
  ~ 2 ~&~ 8 3 5 0 ~&~ 4.65834 ~&~ 4.03699 ~\\ \hline
  \hline
  ~ 5 ~&~ 7 3 5 1 ~&~ 4.68588 ~&~ 3.79556 ~\\ \hline
  \hline
  ~ 18 ~&~ 6 3 5 2 ~&~ 4.91020 ~&~ 3.70465 ~\\ \hline
  ~ \dots ~&~ \dots ~&~ \dots ~&~ \dots ~\\
  \hline
  \end{tabular}
\end{center}

The exact same operation using $E_{min}$ as the decision variable yields very
different results. Similarly, $E$ is only given here as an indication:

\begin{center}
  \begin{tabular}{|c|c|c|c|}
  \hline
  ~ rank ~&~ coefficients ~&~ $E_{min}\times10^4$ ~&~ $E\times10^4$ ~\\ \hline
  ~ 1 ~&~ 6 3 5 2 ~&~ 3.70465 ~&~ 4.91020 ~\\ \hline
  ~ 2 ~&~ 7 3 5 1 ~&~ 3.79556 ~&~ 4.68588 ~\\ \hline
  \hline
  ~ 15 ~&~ 7 3 6 0 ~&~ 3.94217 ~&~ 4.65512 ~\\ \hline
  \hline
  ~ 22 ~&~ 8 3 5 0 ~&~ 4.03699 ~&~ 4.65834 ~\\ \hline
  ~ \dots ~&~ \dots ~&~ \dots ~&~ \dots ~\\
  \hline
  \end{tabular}
\end{center}

Our improved metric allowed us to confirm that the original Floyd-Steinberg
coefficients were indeed amongst the best possible for raster scan.
More importantly, using $E$ as the decision variable would have selected
$\frac{1}{16}\{7,3,6,0\}$ or $\frac{1}{16}\{8,3,5,0\}$, which are in fact poor
choices.
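
The two rankings can be compared directly on the pair $\{7,3,6,0\}$ and
$\{7,3,5,1\}$. The differences below are computed from the tabulated values, in
units of $10^{-4}$, and the percentages are rounded and taken relative to the
Floyd-Steinberg value; this comparison is an illustration of ours:

\[ \begin{array}{lcl}
E:       & 4.68588 - 4.65512 = 0.03076 & (\approx 0.7\%
           \mbox{ in favour of } \{7,3,6,0\}) \\
E_{min}: & 3.94217 - 3.79556 = 0.14661 & (\approx 3.9\%
           \mbox{ in favour of } \{7,3,5,1\})
\end{array} \]

\noindent The advantage that $E$ gives to $\{7,3,6,0\}$ is thus several times
smaller than the advantage that $E_{min}$ gives to the Floyd-Steinberg
coefficients.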

For serpentine scan, however, our experiment suggests that
$\frac{1}{16}\{7,4,5,0\}$ is a better choice than the Floyd-Steinberg
coefficients that have nonetheless been widely in use so far (Fig.
\ref{fig:lena7450}).
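
Assuming that $\{a,b,c,d\}$ lists the coefficients for the right, bottom-left,
bottom and bottom-right neighbours, which is the order that makes
$\{7,3,5,1\}$ match the Floyd-Steinberg matrix given earlier, the suggested
serpentine kernel would read as follows on left-to-right rows:

\[ \frac{1}{16} \left| \begin{array}{ccc}
- & x & 7 \\
4 & 5 & 0 \end{array} \right| \]

\noindent We also assume, as is usual for serpentine scanning, that the matrix
is mirrored horizontally on right-to-left rows.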

\begin{figure}
  \begin{center}
    \includegraphics[width=0.4\textwidth]{output-7-3-5-1-serp.eps}
    ~
    \includegraphics[width=0.4\textwidth]{output-7-4-5-0-serp.eps}
  \end{center}
  \begin{center}
    \includegraphics[width=0.4\textwidth]{crop-7-3-5-1-serp.eps}
    ~
    \includegraphics[width=0.4\textwidth]{crop-7-4-5-0-serp.eps}
    \caption{Halftone of \textit{Lena} using serpentine error diffusion with
             the standard Floyd-Steinberg coefficients (\textit{left}) and with
             the coefficients $\frac{1}{16}\{7,4,5,0\}$ (\textit{right}), which
             improve on them in terms of visual quality for the HVS model used
             in section 3. The detailed area (\textit{bottom}) shows fewer
             structure artifacts in the regions with low contrast.}
    \label{fig:lena7450}
  \end{center}
\end{figure}

\section{Conclusion}

We have described an interesting property of error diffusion algorithms
that allows the quality of such halftoning methods to be measured more
precisely. Having shown that this quality is often underestimated by the usual
metrics, we hope to see even more development in simple error diffusion methods.

Confirming Floyd and Steinberg's 30-year-old ``trial-and-error'' result with our
work is only the beginning: future work may cover more complex HVS models,
for instance by taking into account the angular dependence of the human eye
\cite{sullivan}. We plan to use our new metric to improve all error diffusion
methods that may require fine-tuning of their propagation coefficients.

%
% ---- Bibliography ----
%
\begin{thebibliography}{}
%
\bibitem[1]{bayer}
B. Bayer,
\textit{Color imaging array}.
U.S. patent 3,971,065 (1976)

\bibitem[2]{void1}
R.A. Ulichney (Digital Equipment Corporation),
\textit{Void and cluster apparatus and method for generating dither templates}.
U.S. patent 5,535,020 (1992)

\bibitem[3]{void2}
H. Ancin, A. Bhattacharjya and J. Shu (Seiko Epson Corporation),
\textit{Void-and-cluster dither-matrix generation for better half-tone
uniformity}.
U.S. patent 6,088,512 (1997)

\bibitem[4]{fstein}
R.W. Floyd, L. Steinberg,
\textit{An adaptive algorithm for spatial grey scale}.
Proceedings of the Society of Information Display 17, (1976) 75--77

\bibitem[5]{parfstein}
P. Metaxas,
\textit{Optimal Parallel Error-Diffusion Dithering}.
Color Imaging: Device-Indep. Color, Color Hardcopy, and Graphic Arts IV, Proc.
SPIE 3648, 485--494 (1999)

\bibitem[6]{kite}
T. D. Kite,
\textit{Design and Quality Assessment of Forward and Inverse Error-Diffusion
Halftoning Algorithms}.
PhD thesis, Dept. of ECE, The University of Texas at Austin, Austin, TX, Aug.
1998

\bibitem[7]{halftoning}
R. Ulichney,
\textit{Digital Halftoning}.
MIT Press, 1987

\bibitem[8]{spacefilling}
L. Velho and J. Gomes,
\textit{Digital halftoning with space-filling curves}.
Computer Graphics (Proceedings of SIGGRAPH 91), 25(4):81--90, 1991

\bibitem[9]{nasanen}
R. Nasanen,
\textit{Visibility of halftone dot textures}.
IEEE Trans. Syst. Man. Cyb., vol. 14, no. 6, pp. 920--924, 1984

\bibitem[10]{allebach}
M. Analoui and J.~P. Allebach,
\textit{Model-based halftoning using direct binary search}.
Proc. of SPIE/IS\&T Symp. on Electronic Imaging Science and Tech.,
February 1992, San Jose, CA, pp. 96--108

\bibitem[11]{mcnamara}
Ann McNamara,
\textit{Visual Perception in Realistic Image Synthesis}.
Computer Graphics Forum, vol. 20, no. 4, pp. 211--224, 2001

\bibitem[12]{bhatt}
Bhatt \textit{et al.},
\textit{Direct Binary Search with Adaptive Search and Swap}.
\url{http://www.ima.umn.edu/2004-2005/MM8.1-10.05/activities/Wu-Chai/halftone.pdf}

\bibitem[13]{4chan}
moot,
\url{http://www.4chan.org/}

\bibitem[14]{wong}
P.~W. Wong and J.~P. Allebach,
\textit{Optimum error-diffusion kernel design}.
Proc. SPIE Vol. 3018, p. 236--242, 1997

\bibitem[15]{ostromoukhov}
Victor Ostromoukhov,
\textit{A Simple and Efficient Error-Diffusion Algorithm}.
in Proceedings of SIGGRAPH 2001, in ACM Computer Graphics,  Annual Conference
Series, pp. 567--572, 2001

\bibitem[16]{lsmb}
T.~N. Pappas and D.~L. Neuhoff,
\textit{Least-squares model-based halftoning}.
in Proc. SPIE, Human Vision, Visual Proc., and Digital Display III, San Jose,
CA, Feb. 1992, vol. 1666, pp. 165--176

\bibitem[17]{stability}
R. Eschbach, Z. Fan, K.~T. Knox and G. Marcu,
\textit{Threshold Modulation and Stability in Error Diffusion}.
in Signal Processing Magazine, IEEE, July 2003, vol. 20, issue 4, pp. 39--50

\bibitem[18]{sullivan}
J. Sullivan, R. Miller and G. Pios,
\textit{Image halftoning using a visual model in error diffusion}.
J. Opt. Soc. Am. A, vol. 10, pp. 1714--1724, Aug. 1993

\end{thebibliography}

\end{document}