Stratigraphy.net internals: July 2008

Tuesday, July 29, 2008

Crucified by Cuil

I just visited Cuil (self-description: the 'world's biggest search engine') and searched for TaxonConcept to see whether the site is already indexed.
Besides some TDWG wiki entries on the Taxon Concept Schema, Cuil showed links to some paleonet newsgroup posts where I mentioned TaxonConcept. There was no link to TaxonConcept or Snet, which is a bit disappointing but not surprising for a search engine start-up.

The real surprise, however, was the thumbnail that Cuil displayed (for whatever reason) beneath the link: a crucifixion scene! Hmm... does this mean something?

Friday, July 18, 2008

Analysis of Author Networks in Wikipedia

The Social Sciences Department of the J.W. Goethe University Frankfurt sent out a press release via idw about a nice piece of work by Christian Stegbauer, a professor of social science at this department.

In his work, Stegbauer analysed the network of Wikipedia authors and their contributions to Wikipedia topic discussions on philosophy. The network analysis - or graph analysis - showed the social network of Wikipedia authors, how they interact and how they are connected through common topics.
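The idea of connecting authors through common topics can be sketched in a few lines. This is only a minimal illustration, not Stegbauer's actual method: the input format (pairs of author and topic), the function names and the sample data are all assumptions made for this example.

```python
from collections import defaultdict
from itertools import combinations


def build_author_network(contributions):
    """Link two authors for every topic they both contributed to.

    contributions: iterable of (author, topic) pairs (hypothetical format).
    Returns a dict mapping (author_a, author_b) to the number of shared topics.
    """
    authors_by_topic = defaultdict(set)
    for author, topic in contributions:
        authors_by_topic[topic].add(author)

    edges = defaultdict(int)
    for authors in authors_by_topic.values():
        # Sorting gives every pair a stable (a, b) order with a < b.
        for a, b in combinations(sorted(authors), 2):
            edges[(a, b)] += 1
    return dict(edges)


def degree(edges):
    """Count the number of distinct co-authors per author."""
    result = defaultdict(int)
    for a, b in edges:
        result[a] += 1
        result[b] += 1
    return dict(result)


# Made-up sample data, for illustration only.
sample = [
    ("anna", "Kant"), ("ben", "Kant"),
    ("anna", "Hegel"), ("ben", "Hegel"), ("carl", "Hegel"),
]
network = build_author_network(sample)
```

The edge weights show how strongly two authors are connected; the same structure would work for taxonomists linked by the taxa they have described or revised.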

In their graph analysis Stegbauer et al. used an approach very similar to TaxonRank. Of course, the same questions that Stegbauer asked about Wikipedia author networks could also be applied to taxonomists.

Thursday, July 17, 2008

Geology-related Reality TV

In his Arizona geologist blog, Lee Allison featured a reality TV show about drilling for oil: "Black Gold - 2 Miles Deep or 6 Feet Under". Sounds dramatic, doesn't it? Is there any geology footage on YouTube? I haven't checked yet.

Wednesday, July 16, 2008

Fearless geologists

Stefan just sent me a scanned page of the Johannesburg Star containing a very funny story about a (fictitious) 'Survivor'-like TV show: some geologists are sent to a very active volcano, and the winner is the last 'hard-core' geologist remaining. I also found PDF scans of this article in Chris Rowan's blog article Surviver: Geologists, where you can read the whole story.

On the subject of fearless geologists, I also found MJC Rocks' article A Carnival of Death-Defying Geologists ....

There is also an article on Uncyclopedia about geologists, which is very funny and well worth reading ;)

And I simply could not resist posting this picture showing 5 young, fearless geologists jumping from a very high dune in central Iran. It was also sent to me by Stefan, who made my day ;)

Tuesday, July 8, 2008

Geoparser

Today I was searching the web for tools that can scan documents and identify locations or coordinates (which we'll need to reach our ultimate goal of a 4D (space and time) index and search engine ;) ) and found Rod Page's interesting article: iPhylo: From PDFs to Google Earth.
He offers an online service, probably based on regular expressions, which extracts coordinates from PDF files and returns KML or JSON files. A simple and pragmatic approach. Cool!
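To illustrate what such a regular-expression approach might look like, here is a minimal sketch. It is not Rod Page's implementation: the pattern only handles decimal degrees followed by a hemisphere letter (for example "32.65N, 51.68E"), and the function names and sample text are assumptions made for this example. Extracting the text from a PDF file is left out.

```python
import re

# Decimal-degree pairs such as "32.65N, 51.68E" or "12.5 S 45.25 W".
# Degrees/minutes/seconds notation is deliberately not covered.
COORD_RE = re.compile(
    r"(\d{1,2}(?:\.\d+)?)\s*°?\s*([NS])[,\s]+"
    r"(\d{1,3}(?:\.\d+)?)\s*°?\s*([EW])"
)


def extract_coordinates(text):
    """Return a list of (latitude, longitude) tuples found in text."""
    points = []
    for lat, ns, lon, ew in COORD_RE.findall(text):
        lat = float(lat) * (1 if ns == "N" else -1)
        lon = float(lon) * (1 if ew == "E" else -1)
        # Drop matches that are not valid geographic coordinates.
        if -90 <= lat <= 90 and -180 <= lon <= 180:
            points.append((lat, lon))
    return points


def to_kml(points):
    """Wrap the points in a minimal KML document (KML wants lon,lat order)."""
    placemarks = "".join(
        f"<Placemark><Point><coordinates>{lon},{lat}</coordinates></Point></Placemark>"
        for lat, lon in points
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<kml xmlns="http://www.opengis.net/kml/2.2"><Document>'
        f"{placemarks}</Document></kml>"
    )


# Made-up sample sentence, for illustration only.
sample_text = "Samples from 32.65N, 51.68E and 12.5 S 45.25 W were used."
sample_points = extract_coordinates(sample_text)
sample_kml = to_kml(sample_points)
```

A real service would need more patterns (degrees, minutes and seconds, UTM, signed decimals), but the principle stays the same: match, normalise, and emit KML or JSON.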

I also found some geoparser tools that identify location names in texts. The most interesting is MetaCarta's geoparser API, which seems to give good results. MetaCarta's web pages offer some impressive examples of how this API can be used.
Another geoparser is DIGMAP's text mining service, which returns an OGC-compliant XML file containing all features found (not only geographic ones).
And there is EDINA's geoXwalk, which seems to be restricted to the British Isles. However, I could not test this tool: the site only offers a screenshot and some PDF documents about it.

MetaCarta's geoparser seems to be the most advanced solution. However, it is a service offered by a commercial company, and unfortunately their 'terms of use' page returns a 404 error. Most probably they will not offer this service for free.

I wonder if it would be possible to create something similar to Agenames that identifies location names and returns coordinate pairs. Maybe based on the GeoNames gazetteer?
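A first rough idea of such a gazetteer-based lookup, as a sketch only: the tiny in-memory dictionary below stands in for a real gazetteer such as GeoNames, the coordinates are approximate, and all names are assumptions made for this example.

```python
import re

# Tiny stand-in gazetteer: lower-case name -> (latitude, longitude).
# A real tool would load these entries from a gazetteer dump instead.
GAZETTEER = {
    "frankfurt": (50.11, 8.68),
    "johannesburg": (-26.20, 28.05),
    "tehran": (35.69, 51.39),
}


def find_locations(text, gazetteer=GAZETTEER):
    """Return (name, lat, lon) for each gazetteer name found, in text order."""
    lowered = text.lower()
    hits = []
    for name, (lat, lon) in gazetteer.items():
        # Word boundaries avoid matching a name inside a longer word.
        pattern = r"\b" + re.escape(name) + r"\b"
        for match in re.finditer(pattern, lowered):
            hits.append((match.start(), name, lat, lon))
    hits.sort()
    return [(name, lat, lon) for _, name, lat, lon in hits]


# Made-up sample sentence, for illustration only.
found = find_locations("A paper from Frankfurt mentions a core stored in Johannesburg.")
```

The hard part that this sketch ignores is disambiguation: many place names occur more than once in the world, and some are also ordinary words or surnames.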

Friday, July 4, 2008

The new Stratigraphy.net pages

Today I released the new Stratigraphy.net pages, which are a real improvement over the old Snet homepage. The site structure is simpler and much easier to navigate. It now contains a data portal that allows searches on Snet data from our subprojects TaxonConcept and Agenames. I have already written some comments on the technical details of the new data portal, which is based on the OAI-PMH (Open Archives) protocol.
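For readers who have not used OAI-PMH: a harvester simply sends HTTP requests with a verb and a few standard arguments. The sketch below builds such a ListRecords request. The verb and the argument names (metadataPrefix, set, from) come from the OAI-PMH specification; the base URL and the set name are placeholders, not the real Snet endpoint.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL.

    base_url and set_spec are placeholders for this example; oai_dc is the
    Dublin Core format every OAI-PMH repository has to support.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    if from_date:
        params["from"] = from_date  # e.g. "2008-07-04"
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and set name, for illustration only.
example_url = list_records_url("http://example.org/oai", set_spec="taxonconcept")
```

The response is an XML document containing the records, which the portal can then index and search.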

Interestingly, the Snet data page allows even more flexible queries than the original search pages of TaxonConcept or Agenames. For example, authors of taxa or localities of stratigraphic units can be used in queries, while TaxonConcept and Agenames only allow queries on the taxon name or the name of a stratigraphic unit, respectively.