BACKGROUND: In vitro diagnostic medical devices or 'test kits' are important tools for diagnosing illnesses and for monitoring patients' state of health or the effects of medical treatment. It is in the public interest that commercial kits are of high quality. OBJECTIVES: To review the current situation of kit evaluation in Europe and to establish practical and sensible criteria for kit evaluation in general, based on the Swiss experience with the licensing of kits for the detection of human immunodeficiency virus (HIV) infection. Kit evaluation should provide the user with a selection of sufficiently good products. In addition, it should be simple and should not require rare patient materials. It is also important that all manufacturers are treated equally. METHODS: The evaluation criteria depend on the legal situation, the intended use of the test, the state of the art and the severity of the consequences of false-negative or false-positive results. In addition to diagnostic sensitivity and specificity, other important considerations include the manufacturer's quality control, kit presentation, intra- and inter-lot reproducibility and how a test fits into a country's adopted testing system. RESULTS: The above goals can be achieved by setting minimal performance standards, preferably in the form of a lower limit of the confidence interval for the sensitivity or specificity observed in the test's field evaluation. The better the test, the fewer samples have to be studied. For tests with no false results observed during field trials, about 400 samples are needed to demonstrate a sensitivity or specificity of ≥ 99%, while 700 are needed to demonstrate ≥ 99.5%. Further insight into sensitivity is provided by results from well-described and frequently used commercial seroconversion panels. These panels are a dynamic tool that readily identifies insensitive kits that have become technically obsolete.
Specificity assessment should involve not only blood donors but also appropriate clinical controls. Test design clearly has an important bearing on sensitivity and specificity. Indirect binding assays, capture assays and double antigen sandwich assays each have inherent advantages and disadvantages, and positive or negative results obtained with these different test formats may not carry exactly the same message. CONCLUSION: Efficient kit evaluation does not require tens of thousands of samples but can be performed with relatively little expenditure on materials and manpower.
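
The sample numbers quoted in the results for tests with no false results can be approximated with a short calculation. The abstract does not state the confidence level or the interval method, so the sketch below rests on assumptions: a two-sided 95% exact (Clopper-Pearson) interval, for which the lower limit with zero errors in n samples is (alpha/2)^(1/n). The function names are hypothetical and chosen for illustration only.

```python
import math


def lower_limit_zero_errors(n: int, confidence: float = 0.95) -> float:
    """Lower limit of the two-sided exact (Clopper-Pearson) interval
    when all n samples were classified correctly (zero false results).

    Assumption: two-sided interval at the given confidence level.
    """
    alpha = 1.0 - confidence
    return (alpha / 2.0) ** (1.0 / n)


def min_samples_zero_errors(target: float, confidence: float = 0.95) -> int:
    """Smallest n for which the lower limit reaches `target`,
    given that no false results are observed."""
    alpha = 1.0 - confidence
    # Solve (alpha/2)^(1/n) >= target for n.
    n = math.ceil(math.log(alpha / 2.0) / math.log(target))
    # Guard against floating-point rounding at the boundary.
    while lower_limit_zero_errors(n, confidence) < target:
        n += 1
    return n


if __name__ == "__main__":
    for target in (0.99, 0.995):
        print(f"target >= {target:.1%}: n = {min_samples_zero_errors(target)}")
```

Under these assumptions the calculation gives 368 samples for a lower limit of ≥ 99% and 736 for ≥ 99.5%, the same order of magnitude as the figures of about 400 and 700 given above. A one-sided interval or a different confidence level would shift these numbers, and any observed false result increases the required sample size.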