A large portion of the useful information on the web takes the form of unstructured natural language documents. Such documents are understandable to humans but not to software agents. One of the goals of the Semantic Web activity is to enrich a considerable number of web documents with annotations, which will then allow new-generation search engines and novel web services to access those documents more intelligently than is currently possible. At present, the most reliable method of providing such semantic markup is manual annotation, possibly based on predefined ontologies and supported by specialized editors. In this paper we propose an approach for the automatic processing of textual documents to be published on the web, which can be used to generate (some of) the semantic annotations automatically. In particular, we focus on detecting the entities mentioned in the documents, their roles, and their relationships to other entities.