Data management

Joins are more expensive than plain concatenation. In your case UNION or UNION ALL might be a good solution: the database evaluates the first SELECT statement, moves on to the next SELECT statement, and appends its rows to the end of the output table (UNION additionally removes duplicate rows, so UNION ALL is the cheaper of the two). With a join, finding the matching rows between the joined tables is a huge cost factor.
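To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables sales_2023 and sales_2024 and their rows are made up for illustration; your schema will differ, and the relative cost depends on your database and indexes.

```python
import sqlite3

# In-memory database with two hypothetical tables that share the same columns.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales_2023 (id INTEGER, amount REAL);
    CREATE TABLE sales_2024 (id INTEGER, amount REAL);
    INSERT INTO sales_2023 VALUES (1, 10.0), (2, 20.0);
    INSERT INTO sales_2024 VALUES (2, 20.0), (3, 30.0);
""")

# UNION ALL appends the second result set to the first, duplicates included.
union_all_rows = conn.execute(
    "SELECT id, amount FROM sales_2023 "
    "UNION ALL "
    "SELECT id, amount FROM sales_2024"
).fetchall()

# UNION does the same but also removes duplicate rows.
union_rows = conn.execute(
    "SELECT id, amount FROM sales_2023 "
    "UNION "
    "SELECT id, amount FROM sales_2024"
).fetchall()

# A JOIN has to match rows across the two tables on a key.
join_rows = conn.execute(
    "SELECT a.id, a.amount, b.amount "
    "FROM sales_2023 AS a JOIN sales_2024 AS b ON a.id = b.id"
).fetchall()

print(len(union_all_rows), len(union_rows), join_rows)
```

UNION ALL only works when both SELECT statements return the same number of columns with compatible types, so it fits the case where the tables hold the same kind of record.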

Mapping your unstructured data onto JSON or YAML file structures would help you significantly. Another solution would be to use a NoSQL database such as MongoDB. A third option would be an online storage service for your data, such as Amazon Simple Storage Service (Amazon S3) from Amazon Web Services.
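As a rough illustration of the first option, the sketch below maps two differently shaped items onto JSON with Python's standard json module. The field names (id, type, subject, path, tags) and the values are hypothetical; pick whatever fits your data.

```python
import json

# Two hypothetical records with different fields; JSON does not force a fixed schema.
records = [
    {"id": 1, "type": "email", "subject": "Hello", "body": "some free text"},
    {"id": 2, "type": "image", "path": "scans/0002.png", "tags": ["invoice"]},
]

# Serialize to a JSON string (this could equally be written to a file).
serialized = json.dumps(records, indent=2)

# Read it back and group the record ids by their type field.
restored = json.loads(serialized)
by_type = {}
for record in restored:
    by_type.setdefault(record["type"], []).append(record["id"])

print(by_type)
```

The same list of dictionaries is also close to what a document store such as MongoDB expects, so starting with JSON files does not rule out the second option later.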

The main difference, and the one most likely to drive the software design decision, is embedded mode: H2 supports it, while PostgreSQL doesn't. Furthermore, H2 is written in Java, so it is platform independent and mostly used for small applications, whereas PostgreSQL runs as a separate server and is shipped as native builds for specific operating systems.

The question of managing unstructured data is very broad, and there is no single definitive answer. NoSQL databases are different from relational databases, but their data still has some kind of structure. That structure differs from the traditional relational one, but it is not correct to call the data unstructured.

There also seem to be some tools that visualize data by accessing a SPARQL endpoint (e.g. Apache Jena Fuseki lets you set up a SPARQL endpoint of your own, in addition to the publicly available SPARQL endpoints, if that's of interest). This might be a bit off-topic, but I hope it is at least related to your challenge :)
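In case it helps, here is a minimal sketch of how a client could talk to such an endpoint with only the Python standard library. The endpoint URL is an assumption (a local Fuseki server with a dataset called "dataset"), and the sample payload is made up in the SPARQL JSON results layout; nothing is sent over the network here.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint: a local Fuseki server with a hypothetical dataset name.
ENDPOINT = "http://localhost:3030/dataset/sparql"


def build_request(endpoint, query):
    """Build a GET request that passes the query in the 'query' parameter."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def rows_from_results(payload):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    data = json.loads(payload)
    variables = data["head"]["vars"]
    return [
        {name: binding[name]["value"] for name in variables if name in binding}
        for binding in data["results"]["bindings"]
    ]


request = build_request(ENDPOINT, "SELECT * WHERE { ?s ?p ?o } LIMIT 1")
# To actually run it: urllib.request.urlopen(request).read()

# Made-up response in the SPARQL JSON results layout, for illustration only.
sample_payload = json.dumps({
    "head": {"vars": ["s", "label"]},
    "results": {"bindings": [
        {"s": {"type": "uri", "value": "http://example.org/a"},
         "label": {"type": "literal", "value": "A"}},
    ]},
})
rows = rows_from_results(sample_payload)
print(request.full_url)
print(rows)
```

The flattened rows are easy to hand to whatever plotting or graph library you end up using.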

NoSQL is a whole new way of thinking about a database. NoSQL is not a relational database, and the reality is that the relational model may not be the best solution for every situation. The easiest way to think of NoSQL is as a database that does not adhere to the traditional relational database management system (RDBMS) structure.

One useful tool for visualizing RDF data is WebVOWL - http://visualdataweb.de/webvowl/. It supports RDF data in XML as well as in Turtle (TTL) format and allows complex visualizations of the data.

Apparently SPARQL doesn't allow non-aggregate variables in the SELECT clause unless they are also in the GROUP BY, and adding them there changes the query's meaning. So I came up with another solution, which joins the result with a subquery:

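PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX dbo: <http://dbpedia.org/ontology/>

# Lakes whose area equals the maximum lake area of their country.
select ?country ?area ?lake
where {
  ?lake rdfs:label ?label .
  ?lake rdf:type dbo:Lake .
  ?lake dbo:areaTotal ?area .
  ?lake dbo:country ?country .
  ?country rdf:type dbo:Country .
  FILTER (lang(?label) = 'en') .
  FILTER (?maxarea = ?area)
  {
    # Subquery: the largest lake area per country.
    select ?country (MAX(?area) AS ?maxarea)
    where {
      ?lake rdfs:label ?label .
      ?lake rdf:type dbo:Lake .
      ?lake dbo:areaTotal ?area .
      ?lake dbo:country ?country .
      ?country rdf:type dbo:Country .
      FILTER (lang(?label) = 'en') .
    }
    group by ?country
  }
}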
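
If the shape of that query is hard to follow, the same two-step idea can be sketched in plain Python: first compute the maximum per group, then keep the rows that match it. The rows below are placeholders, not DBpedia data.

```python
# Placeholder (country, lake, area) rows, for illustration only.
rows = [
    ("CountryA", "Lake1", 120.0),
    ("CountryA", "Lake2", 450.0),
    ("CountryB", "Lake3", 80.0),
]

# Step 1 (the subquery): the maximum area per country.
max_area = {}
for country, _lake, area in rows:
    if country not in max_area or area > max_area[country]:
        max_area[country] = area

# Step 2 (the join plus filter): keep the lakes whose area equals that maximum.
largest = [
    (country, area, lake)
    for country, lake, area in rows
    if area == max_area[country]
]

print(largest)
```

As in the query, a country with two lakes tied for the largest area would show up twice.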
