Update to PEP stack 2.0 #23
Comments
|
Sorry for the delay, I wanted to get myself up to speed again on this topic. The last thing I managed to do when upgrading to PEP2.0 was to generate exactly the same shell submission script for each sample. However, running the submission script produced an error from the ATAC-seq pipeline that caused the pipeline to fail (log attached here): Sample_submisson_script.log We were just discussing this issue yesterday in person with @berguner and @sreichl, but I'm happy to continue it here or elsewhere. In the meantime, I'll email you examples of the pipeline configs, project configs, etc. to help with the transition, as attaching *.yaml, *.csv, and *.tsv files doesn't seem to be supported here. |
|
Alright, I just had a go at it. Please see the Nonetheless, the changes really are in the configs, here's a little summary:
I used the @fwzhao, @sreichl are you willing to test this out? Report back and then we can merge this to |
|
I'm trying to understand the changes, and had a couple of questions about the ATAC pipeline: why does the ATACSeqSample class no longer inherit from peppy.Sample? And does self.sample_root come from the series resulting from reading the sample yaml (since this is what's referenced to format many other sample attributes, e.g. mapped)? I can test it out, give me a couple of days :). |
|
Well I tried to explain here:
Basically, before, the object knew its output directory only through the project attribute (see https://github.com/epigen/open_pipelines/blob/master/pipelines/atacseq.py#L55). This was kind of a hack, and the only reason ATACseqSample inherited from peppy.Sample was so that the yaml structure would be reconstructed into an object with the right hierarchy (i.e. including the Let me know if it works. |
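To illustrate the idea discussed above, here is a minimal sketch of a sample object that does not inherit from peppy.Sample and derives its paths from its own sample_root rather than from a parent project object. This is not the code from open_pipelines: the class name `ATACseqSampleSketch`, the directory layout, and the example attribute values are all hypothetical, and a plain dict stands in for whatever mapping results from reading the sample yaml.

```python
import os


class ATACseqSampleSketch:
    """Hypothetical stand-in for illustration; not the class in open_pipelines."""

    def __init__(self, attributes):
        # 'attributes' stands in for the mapping read from the sample yaml
        # (assumed here to contain 'sample_name' and 'sample_root').
        for key, value in attributes.items():
            setattr(self, key, value)

        # Paths are formatted from sample_root directly, so the object needs
        # no reference to a project object to know where its outputs go.
        # The "mapped/<name>.bam" layout is an assumption for this sketch.
        self.mapped = os.path.join(
            self.sample_root, "mapped", self.sample_name + ".bam"
        )


# Example values are made up.
sample = ATACseqSampleSketch(
    {"sample_name": "test_sample", "sample_root": "/tmp/results/test_sample"}
)
print(sample.mapped)
```

The point of the design is only that every output path hangs off one attribute carried by the sample itself, which is why inheriting from peppy.Sample is not needed for path handling in this sketch.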
|
Any updates? I'm afraid I won't have much time to spare after tomorrow. |
|
Yep - I'm done. The test ran successfully after a few tweaks to open_pipelines and microtest. Also got TSS enrichment to run again (it was missing mapping to hg19 in the atac yaml). Not closing the issue just yet, but will run it on an actual sample now. |
|
Thanks. I created a PR already, but let's see if we want to update the other pipelines too first. |
|
@sreichl do you want to test this too on a CLL sample? Let's see if the project config is good and the hg38 resources are well referenced. |
|
@sreichl I see the Also, do you really want to point to a directory called "project_folder" for the pipeline outputs? https://github.com/epigen/cll-progression/blob/master/metadata/project_config.yaml#L26 |
|
I tested on a real sample, and it worked :) |
Okay, will do. Regarding project_folder: the project_folder is always a symlink pointing to the respective project folder wherever that is (e.g. on the HPC). That way I can have repo clones in different locations (i.e. local, HPC, other) and only have to specify the symlink in each to be able to continue working. No more having an unsynced repo clone within the project folder on the HPC. |
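For anyone unfamiliar with the symlink setup described above, here is a minimal sketch of how such a link could be created in each repo clone. The helper name `link_project_folder` and the temporary directories are hypothetical and only there to make the example runnable; on a real machine the target would be the actual project directory (e.g. on the HPC).

```python
import os
import tempfile


def link_project_folder(repo_clone, real_project_dir, link_name="project_folder"):
    """Create (or refresh) a symlink inside a repo clone that points at the
    real project directory. Hypothetical helper, for illustration only."""
    link_path = os.path.join(repo_clone, link_name)
    if os.path.islink(link_path):
        os.remove(link_path)  # replace an existing link so it can be re-pointed
    os.symlink(real_project_dir, link_path, target_is_directory=True)
    return link_path


# Demo with throwaway directories standing in for a clone and the real folder.
with tempfile.TemporaryDirectory() as tmp:
    repo_clone = os.path.join(tmp, "repo_clone")
    real_dir = os.path.join(tmp, "hpc_project_dir")
    os.makedirs(repo_clone)
    os.makedirs(real_dir)

    link = link_project_folder(repo_clone, real_dir)
    link_is_symlink = os.path.islink(link)
    link_resolves = os.path.realpath(link) == os.path.realpath(real_dir)
```

With this arrangement the project config can keep referring to the same relative name in every clone, and only the link target differs per machine.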


Hi,
I'm no longer using this code, but I'm still collaborating with @sreichl on projects that use this.
I've heard there's some trouble upgrading this to work with the PEP stack>=2.0.
@fwzhao I believe you did some work on this on the project side to upgrade project configs, etc.
Do you want to share your progress, and any issues you might have so we can start upgrading the pipelines?
Anyone else interested, please pitch in.