If this blog post seems different than usual, it’s because I’m actually using it to get tech support via Twitter for an issue I’m having. One of the tasks on my current project has me generating data for a test database. DBpedia is the data source, and I’ve been running SPARQL queries against its Virtuoso endpoint to retrieve RDF/XML-formatted data. For some reason, though, the resulting data doesn’t validate.
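For anyone who wants to reproduce this, here is a minimal sketch of how such a request could be issued from Python. The endpoint URL, the `build_request` and `fetch_rdf_xml` helper names, and the choice to ask for RDF/XML through the `Accept` header are my assumptions for illustration, not necessarily how the data in this post was fetched.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: DBpedia's public SPARQL endpoint; substitute your own if it differs.
ENDPOINT = "https://dbpedia.org/sparql"


def build_request(query: str, endpoint: str = ENDPOINT) -> Request:
    """Build a GET request that asks the endpoint for RDF/XML results."""
    # The SPARQL protocol passes the query text in the `query` parameter.
    url = endpoint + "?" + urlencode({"query": query})
    # Assumption: the endpoint honours content negotiation for this media type.
    return Request(url, headers={"Accept": "application/rdf+xml"})


def fetch_rdf_xml(query: str, endpoint: str = ENDPOINT) -> bytes:
    """Send the request and return the raw response body (needs network access)."""
    with urlopen(build_request(query, endpoint), timeout=30) as response:
        return response.read()
```

Calling `fetch_rdf_xml(query)` with the query text as a string should return the raw bytes that can then be handed to a validator.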
For example, the following query:
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dc: <http://purl.org/dc/elements/1.1/>
PREFIX : <http://dbpedia.org/resource/>
PREFIX dbpedia2: <http://dbpedia.org/property/>
PREFIX dbpedia: <http://dbpedia.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?property ?hasValue ?isValueOf
WHERE {
{ <http://dbpedia.org/resource/Bank> ?property ?hasValue FILTER (LANG(?hasValue) = 'en') .}
UNION
{ ?isValueOf ?property <http://dbpedia.org/resource/Bank> }
}
generates the RDF/XML output here. If I try to parse the file with an RDF validator (like this one, for example), validation fails. Removing the node ID attributes from the output takes care of the validation issues, but what I’m not sure of is why the node IDs are added in the first place.
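The workaround can be sketched in a few lines of Python. This assumes the offending attributes are the standard `rdf:nodeID` attributes from the RDF namespace, which I have not confirmed against the actual output; `strip_node_ids` is a hypothetical helper name. Note that dropping node IDs is lossy: two elements that referred to the same blank node will no longer be linked afterwards.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
# ElementTree stores namespaced attribute names in "{namespace}local" form.
NODE_ID = f"{{{RDF_NS}}}nodeID"

# Keep the conventional "rdf" prefix when the tree is serialized again.
ET.register_namespace("rdf", RDF_NS)


def strip_node_ids(rdf_xml: str) -> str:
    """Return the document with every rdf:nodeID attribute removed."""
    root = ET.fromstring(rdf_xml)
    for element in root.iter():
        # Assumption: rdf:nodeID is the attribute that trips the validator.
        element.attrib.pop(NODE_ID, None)
    return ET.tostring(root, encoding="unicode")
```

Other namespace prefixes in the document may be renamed by ElementTree on output unless they are registered the same way as `rdf`.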