Public Datasets

In December 2010, an initiative was started to share data from public databases among users on LinGA. If one researcher needs all the gene descriptions and locations from UCSC or NCBI, it makes sense to make that data available for other people to use as well. This saves space on the servers, reduces costs, and makes us more efficient as a group. These databases can be updated weekly, monthly, or as otherwise required. Some shared information is already available at /data/share/mirrors, and we will add to it.

The current phase (December 2010) is to gather a list of the files and databases that you would ideally like stored in this shared place on the LinGA system. We cannot guarantee that we can store every file you request, but we will consider each one depending on its size and how easy it is to keep updated. This is not limited to database files: if interfaces (SQL related?) would be required, we will consider those as well.

The next phase (from January 2011) will be to download the files that we can, store them in a shared place, and organize regular updates to keep them current and therefore useful.
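The regular updating described above could be driven by a simple staleness check. The sketch below is only an illustration, not the actual LinGA procedure: the function names `is_stale` and `files_to_refresh` and the idea of a maximum age in days are assumptions made for the example.

```python
import os
import time

SECONDS_PER_DAY = 24 * 60 * 60


def is_stale(path, max_age_days, now=None):
    """Return True if the file at `path` is missing or older than `max_age_days`."""
    if now is None:
        now = time.time()
    try:
        mtime = os.path.getmtime(path)
    except OSError:
        # A file that has never been downloaded always needs fetching.
        return True
    return (now - mtime) > max_age_days * SECONDS_PER_DAY


def files_to_refresh(directory, max_age_days, now=None):
    """List regular files directly under `directory` that are due for an update."""
    stale = []
    for name in sorted(os.listdir(directory)):
        full = os.path.join(directory, name)
        if os.path.isfile(full) and is_stale(full, max_age_days, now):
            stale.append(full)
    return stale
```

For example, `files_to_refresh("/data/share/mirrors", 7)` would list the top-level files in the existing mirror directory that were last modified more than a week ago. The weekly interval is a placeholder, and each dataset would likely need its own schedule.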

If there is a particular file or files you would like stored on LinGA, please email scott7 at to have them added to the list below. Unlike your other files, this shared directory will probably not be backed up, because it contains only information from public databases that we can obtain again.

List of currently requested directories/files (updated from emails received):
(all files in directory)
(all files in directory)
(all files in directory)
(SOME files likely to be stored; please email which ones you need)

Note: There are already some files stored in /data/share/mirrors.