17 events
Nov 28, 2017 at 15:08 comment added Gnudiff @igal I was not completely certain of that. However, in case you are right, there probably are some rows that somehow reset the result?
Nov 28, 2017 at 13:33 comment added igal @Gnudiff I had the same initial thought as you, but we're being told that the program runs to completion without error. I would have expected it to hang or crash if it ran out of memory - don't you think?
Nov 28, 2017 at 10:53 comment added Gnudiff If the files are very large, processing them in memory might take a long time and/or exhaust memory. In that case, you would probably be better off converting this solution to an SQLite one: put the rows into an SQL database and run a query on them. For SQL purposes, the structure seems very simple.
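A minimal sketch of the SQLite approach suggested in the comment above, assuming the task is to report which SNP positions fall inside which start/end ranges. The function name `snps_in_ranges`, the table names `snps` and `ranges`, and the column names are hypothetical and not taken from the answer; only the standard-library `sqlite3` module is used.

```python
import sqlite3


def snps_in_ranges(positions, ranges, db_path=":memory:"):
    """Return (position, range_start, range_end) rows for every SNP
    position that lies inside a range (bounds inclusive).

    Table and column names here are illustrative assumptions.
    """
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE snps (position INTEGER)")
    conn.execute("CREATE TABLE ranges (range_start INTEGER, range_end INTEGER)")

    # Generators keep the Python side from holding a second copy of the data.
    conn.executemany(
        "INSERT INTO snps VALUES (?)",
        ((int(p),) for p in positions),
    )
    conn.executemany(
        "INSERT INTO ranges VALUES (?, ?)",
        ((int(s), int(e)) for s, e in ranges),
    )

    # An index on position lets SQLite avoid a full scan per range.
    conn.execute("CREATE INDEX idx_snps_position ON snps (position)")

    rows = conn.execute(
        "SELECT s.position, r.range_start, r.range_end "
        "FROM ranges AS r JOIN snps AS s "
        "ON s.position BETWEEN r.range_start AND r.range_end "
        "ORDER BY s.position, r.range_start"
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in snps_in_ranges(["5", "15", "25"], [("10", "20"), ("20", "30")]):
        print(row)
```

Passing a new file path as `db_path` instead of the default `":memory:"` would keep the tables on disk, which is probably the better choice when the input is too large to hold in memory.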
Nov 28, 2017 at 5:29 comment added igal @Anna1364 You can sign up for GitHub for free. Or you can share a public link from a cloud hosting server (e.g. Google Drive, DropBox, etc.).
Nov 28, 2017 at 5:06 comment added Anna1364 @igal, I do not have a GitHub page! Any other idea where I can share my data with you? Your code gives exactly what I want, it just does not work with the entire data. Thanks for your help!
Nov 27, 2017 at 6:12 comment added igal @Anna1364 I'd rather not post my email address. Can you put it somewhere public?
Nov 26, 2017 at 21:51 comment added igal @Anna1364 If you post your data somewhere (e.g. GitHub or something) then I'll run the script on your data myself.
Nov 26, 2017 at 21:49 comment added igal @Anna1364 Does it actually terminate without producing any output or does it just run for a really, really long time? I didn't put any effort into making it efficient. If your input is really large then it might take a long time or possibly hang or crash.
Nov 26, 2017 at 21:15 comment added Anna1364 @igal, thanks so much. I tried your Python code with a small subset of the data, and it works perfectly fine. But there is a problem when I run it on the entire dataset! I have nearly 10 million SNPs, and when I run the script on the entire dataset it does not produce any output! I wonder what might be wrong?
Nov 26, 2017 at 4:10 history edited igal CC BY-SA 3.0
Removed double-quotes to match updated question.
Nov 26, 2017 at 2:28 comment added igal @iruvar Thank you for the feedback - updated.
Nov 26, 2017 at 2:27 history edited igal CC BY-SA 3.0
deleted 18 characters in body
Nov 26, 2017 at 2:25 comment added iruvar very good, +1. int(start) <= int(position) and int(position) <= int(end) is idiomatically int(start) <= int(position) <= int(end)
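A small illustration of the chained comparison mentioned in the comment above, assuming `start`, `position` and `end` arrive as strings read from a text file. The helper name `in_range` is hypothetical and does not come from the answer.

```python
def in_range(start, position, end):
    """Return True if position lies between start and end, inclusive.

    Python evaluates a <= b <= c as (a <= b) and (b <= c), with b
    evaluated only once, so this matches the longer two-clause form.
    """
    return int(start) <= int(position) <= int(end)


if __name__ == "__main__":
    print(in_range("10", "15", "20"))
    print(in_range("10", "25", "20"))
```

The `int()` conversions matter here: comparing the raw strings would order them lexicographically, so "9" would sort after "10".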
Nov 26, 2017 at 2:22 history edited igal CC BY-SA 3.0
Corrected solution.
Nov 26, 2017 at 2:05 history edited igal CC BY-SA 3.0
added 987 characters in body
Nov 26, 2017 at 0:59 history edited igal CC BY-SA 3.0
added 1729 characters in body; added 74 characters in body; added 8 characters in body
Nov 26, 2017 at 0:52 history answered igal CC BY-SA 3.0