DNA diversity index script

Client: Julian Marchesi
Technologies used: Python

The client needed to calculate a DNA diversity index. This meant reading a large file containing a list of DNA sequences grouped by species, along with a second file containing a matrix of sequence distances.

The sequence dataset could contain 500 000+ values, while the matrix dataset could contain 10 000 x 10 000 distance values. The greatest challenge was designing an algorithm that reads the data and calculates the required value in as little time as possible, because the script would be run on multiple datasets.
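The sketch below illustrates the general shape of such a script. It is not the client's actual code. The file formats are assumptions (one `species<TAB>matrix index` pair per line for the sequence file, whitespace-separated rows for the matrix file), and because the exact index is not specified here, the mean pairwise distance within a species is used as a stand-in. The function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def read_groups(lines):
    """Map each species to the matrix indices of its sequences.

    Assumed format (hypothetical): 'species<TAB>index' on each line.
    Lines are consumed one at a time, so a file object can be passed
    directly without loading the whole file into memory.
    """
    groups = defaultdict(list)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        species, index = line.split("\t")
        groups[species].append(int(index))
    return groups


def read_matrix(lines):
    """Parse a whitespace-separated distance matrix (assumed format)."""
    return [[float(x) for x in line.split()] for line in lines if line.strip()]


def mean_pairwise_distance(indices, matrix):
    """Average distance over all unordered pairs of sequences in one group.

    This is an illustrative stand-in for the diversity index.
    """
    total = 0.0
    count = 0
    for i, j in combinations(indices, 2):
        total += matrix[i][j]
        count += 1
    return total / count if count else 0.0


def diversity_by_species(group_lines, matrix_lines):
    """Compute the stand-in index for every species in the dataset."""
    groups = read_groups(group_lines)
    matrix = read_matrix(matrix_lines)
    return {
        species: mean_pairwise_distance(indices, matrix)
        for species, indices in groups.items()
    }
```

Both readers accept any iterable of lines, so open file objects can be passed in directly. For a matrix of 10 000 x 10 000 values, nested Python lists use a lot of memory, so a compact array type such as a NumPy array would likely be a better fit in practice.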
