Category: Technical Leadership: Data Resources

Parent categories: Technical Leadership

The ELIXIR Core Data Resources: fundamental infrastructure for the life sciences - the ELIXIR team

Funding research computing and data resources is hard. As a recent blog post at US-RSE pointed out, research infrastructure is generally funded as a project, which ends (“and they all lived happily ever after”), rather than as a product that continues to be used; “sustainability” is a word that comes up very quickly in research computing conversations. This is a good paper that makes a familiar case for funding...


Dataset data sheets

Datasheets for Datasets Template - Audrey Beard

Datasheets for Datasets - Timnit Gebru et al, arXiv:1803.09010

Beard provides a LaTeX template for Gebru et al’s suggested “Datasheets for Datasets”: a human-readable, high-level description of a dataset. It is not a data dictionary; instead it describes why the dataset exists, how the data was collected, what preprocessing, cleaning, or labeling was done (if any), whether and how the dataset will be maintained, what uses it has been put to, and more.


Developing a modern data workflow for regularly updated data - Glenda M. Yenni *et al*, PLOS Biology

Other tags: Technical Leadership: CI/CD

Updating Data Recipe - Ethan White, Albert Kim, and Glenda M. Yenni

This one’s a couple of years old, and I’m surprised I hadn’t seen it before. It’s getting easy to find good examples that help scientists get started with GitHub, and then with CI/CD, for code. For data, it’s much harder. And there’s no reason why experimental data shouldn’t benefit from versioning, and analysis pipeline CI/CD that...
