Tuesday, October 23, 2012

Biodiversity Data Journal - archiving or self archiving?

Another interesting development in the world of research data management: Pensoft announced that the Biodiversity Data Journal (BDJ) will start accepting submissions in December 2012. The BDJ is a new data journal similar to the Earth System Science Data Journal (ESSD), which pioneered the publishing of research data.

Biodiversity Data Journal (BDJ) is a community peer-reviewed, open-access, comprehensive online platform, designed to accelerate publishing, dissemination and sharing of biodiversity-related data of any kind. All structural elements of the articles – text, morphological descriptions, occurrences, data tables, etc. – will be treated and stored as DATA.

So far this sounds very good. Less good, IMHO, is BDJ's data archiving policy, which is a bit vague.

The BDJ journal homepage only provides a link to the general Pensoft data publishing guidelines, which offer many options for how one should archive data. Of course they mention Dryad, PANGAEA etc., which are suitable data archives. But it is hard to understand why GBIF's Integrated Publishing Toolkit (IPT) and Scratchpads are also included as options.

GBIF is not an archive, and both Scratchpads and the IPT are software solutions which can be used for archiving purposes but need to be hosted by a reliable organisation, and no such organisations are named. Therefore, in its current form, Pensoft's data archiving policy could be understood as an invitation to self-archiving ... or does BDJ mean, e.g., IPT in the sense of a "Pensoft IPT"? This should be fixed by December.

Friday, June 8, 2012

Don't build on external APIs

For one of my older projects (agesearch) I needed access to search engine results, which I used to 'calculate' the stratigraphic context of given keywords. The idea was to visit all pages the search engine returns and to use Ageparser to scan these pages for stratigraphic terms in order to estimate their chronostratigraphic context, which should also be representative of the entire search phrase.
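As a rough illustration of that idea, here is a minimal sketch in Python. It is not the actual agesearch or Ageparser code: the term list `STRAT_TERMS` and the functions `count_terms` and `estimate_context` are hypothetical stand-ins, and the page texts are assumed to have been downloaded already.

```python
import re
from collections import Counter

# Hypothetical, tiny term list; a real parser such as Ageparser
# would know far more stratigraphic terms than these.
STRAT_TERMS = ["Cambrian", "Devonian", "Jurassic", "Cretaceous", "Holocene"]


def count_terms(text):
    """Count whole-word, case-insensitive hits of each term in one page."""
    counts = Counter()
    for term in STRAT_TERMS:
        pattern = r"\b" + re.escape(term) + r"\b"
        hits = re.findall(pattern, text, flags=re.IGNORECASE)
        if hits:
            counts[term] += len(hits)
    return counts


def estimate_context(pages):
    """Sum the counts over all pages; return the most frequent term or None."""
    total = Counter()
    for text in pages:
        total.update(count_terms(text))
    if not total:
        return None
    return total.most_common(1)[0][0]
```

Called with the texts of the pages returned for a search phrase, `estimate_context` simply picks the term mentioned most often. A simple majority vote like this is only one possible way to turn term hits into an estimate.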

I started out using the Yahoo search API, which worked nicely, but after a while Yahoo decided to restrict free access and started to charge for the use of their API. So I switched to Google's Search API, which also worked well and allowed me to access an even larger website index.
However, Google has now also started to charge for API usage, which again forced me to look for an alternative. So I came to Microsoft's Bing API. This service also works pretty well, and agesearch seems to be working again. Microsoft still offers free access to this API, but I could hardly believe it when I saw that they recently placed it on their 'Windows Azure Marketplace' and offer several upgrade options.

So don't rely too much on external APIs, even when they are offered by the big web elephants. Their strategy of introducing a new service for free and then, after a while, restricting access and charging for usage can truly be a trap for your projects.
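One way to soften the blow is to keep the search engine behind a small interface of your own, so that swapping providers does not touch the rest of the project. The following Python sketch is only an illustration under that assumption; `SearchProvider`, `search_with_fallback`, `AllProvidersFailed` and the two dummy providers are hypothetical names, not part of any real search API.

```python
from typing import Callable, List, Sequence

# A provider is any function that takes a query and returns result URLs.
SearchProvider = Callable[[str], List[str]]


class AllProvidersFailed(Exception):
    """Raised when every configured provider raised an error."""


def search_with_fallback(query: str, providers: Sequence[SearchProvider]) -> List[str]:
    """Try each provider in order and return the first successful result."""
    errors = []
    for provider in providers:
        try:
            return provider(query)
        except Exception as exc:  # a real adapter would catch narrower errors
            errors.append(exc)
    raise AllProvidersFailed(
        f"{len(errors)} provider(s) failed for query {query!r}"
    )


# Hypothetical stand-ins for real API adapters.
def retired_provider(query: str) -> List[str]:
    raise RuntimeError("free access discontinued")


def working_provider(query: str) -> List[str]:
    return ["result for " + query]
```

With this in place, `search_with_fallback("devonian", [retired_provider, working_provider])` skips the provider that no longer works and returns the results of the next one. It does not make the dependency go away, but it limits the damage to writing one new adapter.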

Monday, January 9, 2012

Termination of National Biological Information Infrastructure (NBII)

As announced earlier on the White House website, a major budget cut has forced the USGS to terminate several initiatives. Among those, all services of the National Biological Information Infrastructure (NBII) will be shut down or transferred to as yet unknown locations (maybe data.gov). This will happen on January 15, 2012, so if you have to work with the NBII services ... hurry up. This is really bad news, as I enjoyed the excellent web services of e.g. ITIS.

For those who want to know what will happen to all the tools, databases and services hosted at the NBII website, here are the FAQ: http://www.nbii.gov/portal/server.pt/community/termination_of_nbii_program/2057/termination_faqs/7650.

Most delicately, this termination will also be a shock for the WDC system (now WDS), as the NBII was also hosting the World Data Center for Biodiversity and Ecology!


For those who want to know what will happen to ITIS:

"What will happen with related USGS Biological Informatics Programs such as the Integrated Taxonomic Information System (ITIS), USGS Gap Analysis Program (GAP), and USGS/NPS Vegetation Mapping Program?

The former USGS Biological Informatics Program (now part of the USGS Core Science Analysis and Synthesis Program) is the parent program to the NBII, ITIS, GAP, and the Vegetation Mapping Program. While the NBII Program has been terminated and its funding eliminated, ITIS, GAP, and the Vegetation Mapping Program remain funded activities and will continue to provide high quality data, services, and cyberinfrastructure to collaborators and users."