Add labels and associated features #527

JRegimbal · 2020-05-04T17:22:38Z

The following features are added in this pull request:

- Resource labels with unique names
- Updating resource label names through the API
- Filtering resources by associated labels (multiple labels are combined with AND, not OR)
- A "Labeler" job that adds a label to specified resources generated in a workflow run
- Generation of a "snapshot" with an overview of the workflow run in its description
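
The AND semantics for label filtering can be illustrated with a small, framework-free sketch. The `Resource` class and the `filter_by_labels` function are hypothetical stand-ins for illustration only; this is not the view code from this pull request.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    """Hypothetical stand-in for a Rodan resource (illustration only)."""
    name: str
    labels: set = field(default_factory=set)


def filter_by_labels(resources, wanted_labels):
    """Return the resources that carry every label in wanted_labels (AND)."""
    wanted = set(wanted_labels)
    # Subset test: all requested labels must be present on the resource.
    return [r for r in resources if wanted <= r.labels]


resources = [
    Resource("page-1", {"scanned", "reviewed"}),
    Resource("page-2", {"scanned"}),
    Resource("page-3", {"reviewed"}),
]

# Only page-1 has both labels; an OR filter would have matched all three.
both = filter_by_labels(resources, ["scanned", "reviewed"])
print([r.name for r in both])
```

With OR semantics, adding a second label would widen the result set; with AND, each extra label narrows it, which is the behaviour described in the feature list.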

Check that the label list is not None rather than relying on its truthiness. Otherwise the check fails when a (valid) empty string is given: an empty string should still be processed so that all tags can be removed, but empty strings are falsy in Python, so compare against None.
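
The difference between testing truthiness and comparing against None can be sketched as follows. The function name `parse_label_names` and the comma-separated input format are assumptions made for illustration; they are not taken from the Rodan code.

```python
def parse_label_names(raw):
    """Turn a raw request value into a list of label names.

    Returns None when no value was supplied (leave labels untouched) and
    an empty list when an empty string was supplied (remove all labels).
    """
    # `if raw:` would wrongly skip "", because empty strings are falsy.
    if raw is None:
        return None
    # Assumed format: comma-separated names; blank entries are dropped.
    return [name.strip() for name in raw.split(",") if name.strip()]


print(parse_label_names(None))
print(parse_label_names(""))
print(parse_label_names("a, b"))
```

Returning None and an empty list as distinct values lets the caller tell "no change requested" apart from "remove every label".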

rodan/jobs/labeler.py

rodan/models/resourcelabel.py

deepio · 2020-05-04T19:12:49Z

rodan/settings.py

@@ -134,6 +134,7 @@
 RODAN_JOB_PACKAGES = []
 BASE_JOB_PACKAGES = [
    "rodan.jobs.resource_distributor",
+    "rodan.jobs.labeler",


We will need to group these 2 eventually into a core rodan job package. Doesn't have to be now.

rodan/jobs/base.py

rodan/jobs/labeler.py

rodan/views/resourcelabel.py

* Renamed production.env, consistent with rodan-docker. * Added: Gunicorn and Celery environment variables. * Added: Admin url as an environment variable * Revert "Add support for RODAN_NON_INTERACTIVE in load.py." We do not need to perform this action in rodan-docker anymore since removing configure. This reverts commit 458cd03. * [WIP] Added: SSL Certificates to Rodan. (#474) * Added: SSL and domain vars [WIP] * Added information about the redis configuration. * Fixed: Redis DB env typo and django logger level. * Added: Base Superuser account, defined in env file. * Fixed: Admin env variable typo. * Fixed: Have iipserver hit the nginx service inside docker. * Removed: Trailing whitespace from env-files. * Removed: Cleaned websocket whitespace. * Added: httpport for websocket [WIP] see description. Still debating between using a unix socket in a docker volume (not a bind mount) or using http requests between containers to make this work. Right now, i'm using http requests to test. Unix sockets should be faster or not depending on how docker volumes update each other. * Added: Django environ. * Added: Static files for collection. * Modified: Unified the codestyle in settings.py * Updated: python-dateutil. * Updated multiple dependencies, and made the code uniform through the entire project. (#476) * Modified: Black-ed all the code. Too many developers did not keep the same coding style, all pages are divided in two or three different styles, python black will take care of the minutiae of the formatting. * Added: More detailed documentation in settings.py. * Modified: Reorganized admin, debug and test runner in settings.py * Added: Django-environ. * Added: Staticfiles for gunicorn. * Fixed: typos in settings.py * [WIP] dateutil. * Removed: Tornado is not being used anymore in this project. * Updated: Django to 1.8.19 * Updated: PyYaml * Updated gunicorn. * Alphabetized [Temporary, sort them by usage later.] * Updated: pillow to 5.2.0. 
* [WIP] Removed paramiko and flower, see description. They are not used and not setup. Flower would be helpful in the future, let's make the app work before adding other elements. * Updated: Django rest framework to 3.3.0. * Fixed: Removed django 1.7 (python < 2.7) compatibility with REST. * [WIP] - Allow all cors in django. * Added: Travis build, test, and notifications. * Removed: Redis is already defined in requirements. * Fixed: psycopg2 and travis build setting. * Added: Environment variables to the build. * Added: Start the postgres server. * Modified: Hostname for travis. * Removed: Jobs from settings.py. It is better to have them added to the Rodan docker process, instead of having them here. * Added: Python redis for travis. Force install redis outside of virtual env for travis. * Fixed: Redis server address for travis. * Added: https for iipsrv. * Fixed: xargs, not a valid identifier. * Modified: Renaming folder structure in docker. * Fixed: django env arrays require unicode. * Added: pil-rodan and uncommented other unittests. * Added: contrib.session to settings.py * Fixed: wrong dir in travis. * Modified: helloworld rodan_job version example. * Modified: [WIP] There is an issue with using travis/docker/env_file together for django. * Fixed: typo in staging.env * Modified: Permissions. * Updated: Copyright version. * Fixed: postgres naming convention. * Fixed: Missing renaming env * Modified: env variable for postgres * Updated: CORS headers. * Removed: theano env variable. * Added: RabbitMQ env variable. * Fixed: Path issues with gunicorn * Modified: Reformated django test * Modified: Input/output port types. See description [WIP] The input and output ports on a job are very volatile during development. For this reason, the ability to change the name of an input/output port has been added to their models. Jobs will remain in the workflows, but a check should be added to workflowjobs when `input_port_types` and `output_port_types` are removed. 
(It should repopulate the correct value automatically) * Modified: Refactored RD, Rodan, and Celery * Modified: Updated dependencies * Modified: Removed CORS methods * Fixed: test_post_image * Fixed: travis/docker log directory separation * Modified: Formatting * Moved: Env files to docker repo. To avoid having to fetch the rodan submodule, the template env file has been moved to the docker repository. * Fixed: Rodan settings bool * Added: Set env vars for travis build * Added: Django Healthchecks * Fixed: Debug bool default * Modified: should view API in prod * Modified: middleware messages * Update: Flower * Fixed: Resource Routing * Update .gitignore * Removed: Static files There are symbolic links and static files that are no longer used in Rodan. They were kept in the repository for documentation, but now are not useful any longer. * . whitespace * Removed: pil-rodan from job packages Rodan does not use submodules, but it also will not work without pil-rodan. This adjustment changes that. The installation process is outside of rodan, and inside the docker build. Rodan should work on its own, but some unittests will not work without pil-rodan. Adjustments need to be made accordingly to the docker build. * Delete .gitmodules * Update resourcelist.py Removed the null field on Many to Many model because it has no effect. But it does produce a warning. * Created: job_queue routing * Added: Job_queue Auto Population in DB The job_queue field in the database now gets populated by the rodan_jobs themselves and settings are saved in the DB. * Added: job_queue description in models definition * Update .gitignore with pixel_wrapper * Modified: Print to print function * Init: Modify database with job_queue and resourcelist manytomany field * Modified: iteritems to items items() is inefficient on python2, but once gamera is finally ported to python3 and we no longer need to support python2 this will not be an issue anymore. * Fixes: thread vs. 
threading defenition * Fixes: Relative import for diva_generate_json * Fixes: Mixed tab/space indents * Fixes: input compatibility * Modified: Do not require all job installations for individual celery queues. The main celery queue needs all jobs installed, this is very important. * Removed: Unused dependencies from requirements.txt * Updated: django-websocket-redis * Updated: Flower The old version of Flower created issues with the python3 installation of Rodan. mher/flower#878 * Modified: Multiple django hosts There are some issues with passing [ ] ' " through environment variables. Instead we create the list dynamically with the added benefit of hosting over multiple domains should we need it. * Added: New jobs to gitignore * Added: HelloWorld with python3 Removed the interactive helloworld job until it is fixed. * Moved: Rodan Core jobs to celery * Update .travis.yml * Fixed: Resultspackage require a packing_mode * Added: Travis-Slack integration * Added: Now able to create resources from a script * Added setTimeout() method to close function so text file can save and job can complete * Added: Added JPEG2000 type because of Kakadu * Modified: Restructured settings for job queues The rodan core image must have ALL rodan jobs listed in settings.py, because it loads all the jobs into the database. Python2 jobs can not be in the python3 list because it will try to load them anyway and crash. This is only important when working as a volume mount of the rodan folder into the docker containers. When the containers are up, it does not matter very much because they each have their own settings.py anyway (different containers, different fs). 
* Added: Tim's jobs to the gitignore * Update .travis.yml * Modified: Make create_diva not rely on pil-rodan * Modified: no longer required to read temp data for KG * Fixed: Project and Files field must be specified * Update .travis.yml * Update .travis.yml * Hard coded pybagit path * Added OpenJPEG for diva creation * Force mastertask to take celery queue * Fixed formatting * Update workflowrun.py * Update core.py * Update settings.py * Fixes: extensions are removed from upload * Update resource.py * Update resource.py * Fixed: unreferenced attribute * Fix outdated error messages in resource assignment tests * Adapt wfrun creation logic to handle multiple resource collections * Validate multiple same-length collections during wfrun creation * Add new hello world job supporting multiple ports * Add test case for creation on multiple resource collections * Modified: Return error response when deleting resource has failed. * Use a ResourceSerializer when deleting a resource We need to give some information on the resource we are deleting when responding to client so use a ResourceSerializer to do that. * Move resource types to Rodan itself. This can be accompanied by deleting the resource_tyeps.yaml files in the various jobs. Ideally this should make the Python 3 container run properly since it (and any other container) has access to all resource type definitions without needing the jobs to be loaded. * Update requirements.txt * . * Fixed: django filters * Fix urls * Fixed: Templates * Fixed: Serializer fileds * Add listing for hpc-fast-trainer to settings.py * [STUB] django-guardian/django-extesions update * Fixed: test_get_list from connection api objects.all returns a Queryset in django 1.11 self.assertEqual(str(response_connections), str(Connection.objects.all())) would not work * sort imports * Added: Filetype identification * Update year * Update .gitignore * Added: Flake8 * . 
* Create alter_rodan_job.py * Updated: DRF to 3.5.4 and django-filter 1.1.0 * Update .travis.yml * Travis-Unittest There are two unitest that always fail on travis an not locally. * . * Update .flake8 * [STUB] * Update test_workflowrun.py * . * Added: Test files to filetype ident * Split in handler. * Added: Json test * Update load.py * Fixed bare exception * Fixed: Travis check * Fixed: Resource Identification * Updated: Model-mommy * Updated: DRF * Updated: Gunicorn * Update helpers.py * Update resource_identification.py * Update resource_identification.py * Update resource_identification.py * Updated: Django-guardian * python2-3 urlparse * Update urls.py * Update for Django templates * Update settings.py * Update settings.py * Update __init__.py * Update urls.py * Fixes: Removes dependence on django pattern * Update urls.py Fixes an unused import statement. * Identify RGBA png files * Fixed: flake8 * Update test_mimetype_identification.py * Add queue for GPU * Update psycopg2 to 2.8.4. * Remove ecdsa dependency * Update requirements.txt * Update test_userpreference.py * Update resourcelist.py * Fixes test post user preferences ERROR: test_post (Rodan.rodan.test.views.test_userpreference.UserPreferenceViewTestCase) * Resource_list moved to get_resource_type * Update requirements.txt * Update workflowrun.py * Update base.py * Create 0017_resourcelist_resource_type.py * Update urls.py * Update resource.py * Update urls.py * Update urls.py * Updated: Djoser * Update urls.py * Update resource.py * Update requirements.txt * Fixes: #514 * . * Create 0018_auto_20200318_2345.py * . * Change how class-based jobs are registered in celery * Downgrade Celery to 3.1.25 (last 3.x release) * Fixes: 'QuerySet' object has no attribute 'workflow_job' * Update core.py * . * Modified: Move registering tasks to celery file. * Update load.py * Version Update This version of Rodan has upgraded security and dependencies, furthermore it has 4 different queues for jobs to run in. 
* Import rodan.jobs.load in urls.py * Update urls.py * Update urls.py * Define EMAIL_USE to False This should resolve #521. * Move to /api url (#524) * Move to /api url * Fix unittest urls * flake8 everything * Update requirements.txt * Include django upload memory settings DATA_UPLOAD_MAX_MEMORY_SIZE and FILE_UPLOAD_MAX_MEMORY. This is to prevent a 413 HTTP status on resource uploads. * Add labels and associated features (#527) * Implement labels * Support for patching resource labels * Remove labels when they do not apply to any resource. * Migration for resource label addition and labels field on resource * Increase label length * Generate runjob label on generated resources * Remove commented out lines * Add test for ResourceLabel model * Include label creation in test * Clean up additions to resource view * Check if list is not None Otherwise this will fail if a (valid) empty string is given. An empty string should be processed so that all tags can be removed, but empty strings are falsey in Python so check against None. * Pass linter * Make label names unique * Apply AND to labels not OR * Migration for resource label * First attempt at "labeler" job * Add snapshot info to workflow run description * Add labeler job to settings.py * Atomic update of description * Add ports to Labeler, only apply UUID on blank Labeler job * Use get_or_create to prevent race conditions * Add name to resource(s) for snapshot * Update for linter * Don't fail just because of an issue with snapshots * Linter and code review * Make labeler default the workflow run UUID * Reduce occurrances of SIGKILL issue (#528) * Ensure temporary file is deleted * Catch failure on OOM and wait 10 s. Retry 3 times. This seems to work in most cases when testing locally and doesn't result in the same kind of slow down as with decreasing the concurrency parameter of the celery worker(s). 
* Properly handle multiple input ports for labeler * Allow for Labeler inputs to appear as results Typically inputs that belong to that workflow run prevent a resource from appearing as a run result but add an exception for Labeler * Ignore added label names that are too long * Set limit on label length in Labeler * Add support for getting resources in zipfile (#531) * Add support for getting resources in zipfile * Singular "resource_uuid" as GET parameter * Add archive endpoint to routes * Tweaks to make the client happy * Add queue "celery" * Add format=format * Don't return empty zip file * Let FileResponse close StringIO itself According to https://code.djangoproject.com/ticket/29278 this kind of response will close the file-like object itself. * Add test for archive * Make linter happy * Handle files with conflicting names * Use six to get StringIO * Use six.moves.range * Make linter happy since six only in doctests * Restore settings.py. Oops. * Modified: six import * whitespace * replace StringIO with six Co-authored-by: Alex Daigle <alex.daigle2@gmail.com> * Remove print statement. Sorry, thought this was removed in the last PR. * Check for MIGRATE env var. This is to allow us to not need to comment in/out these lines to perform migrations. * Add biollante-rodan job to settings.py. (#533) * Ignore biollante-rodan job * Don't show description of workflowrun in list Description field has no size limit and can contain a list of information for the entire workflow run. Depending on the job, it may contain a large amount of info as it will serialize the settings object with can contain any information needed to be preserved in phases of an interactive job. Since this can be so big and massively increase the response object size and is also not very helpful for finding a workflowrun, remove it. 
* Replace ODD file with MEI all for 4.0.1 (#539) * Include references to MEI_resizing * Add PIL-Rodan to py3 worker Already installed due to calvo, but must be explicitly loaded as Resize job uses py3 queue. * Migration to rename Classifier job Rename existing Classifier job to Non-Interactive Classifier as part of DDMAL/gamera_rodan#8. This should be merged into Rodan for the *same build* as the updated gamera_rodan job. Otherwise default behavior will delete the Classifier job from the database rather than rename it. * Check for 0-length resource labels * Update permissions so users can access labels * Remove unused import * Modified: Handle more filetypes * STUB: Prepare for move to grok * Fixes: Flake8 * Fixed: Flake8 * Removal coordinate set models (#546) * Added: Stub * Created: Work-In-Progress Data Migration * Fixed: Tests * Flake8 * Version update Co-authored-by: ImaneChafi <46538274+ImaneChafi@users.noreply.github.com> Co-authored-by: Evan Savage <evansavage@Evans-Mac-mini.local> Co-authored-by: Ling-Xiao Yang <ling@theyang.ca> Co-authored-by: Ling-Xiao Yang <lingxiao.yang@britecore.com> Co-authored-by: Silver92 <wen.xiao@mail.mcgill.ca> Co-authored-by: Juliette Regimbal <juliette.regimbal@gmail.com>

JRegimbal added 23 commits Apr 17, 2020

Implement labels

0a8b5dd

Support for patching resource labels

c7ad70d

Remove labels when they do not apply to any resource.

88eb63e

Migration for resource label addition and labels field on resource

96b550d

Increase label length

64c3d3d

Generate runjob label on generated resources

bdeb820

Remove commented out lines

be52bb7

Add test for ResourceLabel model

8ac36e2

Include label creation in test

0ec9117

Clean up additions to resource view

170f19f

Check if list is not None

765484f

Otherwise this will fail if a (valid) empty string is given. An empty string should be processed so that all tags can be removed, but empty strings are falsey in Python so check against None.

Pass linter

25d4d10

Make label names unique

f63dede

Apply AND to labels not OR

76b6120

Migration for resource label

5d56dda

First attempt at "labeler" job

7b50798

Add snapshot info to workflow run description

cce400f

Add labeler job to settings.py

836b0f0

Atomic update of description

ae47592

Add ports to Labeler, only apply UUID on blank Labeler job

b04a952

Use get_or_create to prevent race conditions

5e4141a

Add name to resource(s) for snapshot

bcb7165

Update for linter

22d23ae

JRegimbal requested a review from deepio May 4, 2020

JRegimbal mentioned this pull request May 4, 2020

Add support for label features DDMAL/rodan-client#156

Merged

deepio reviewed May 4, 2020

JRegimbal added 2 commits May 4, 2020

Don't fail just because of an issue with snapshots

62e01e3

Linter and code review

645c20e

JRegimbal deleted the label-test branch May 4, 2020

DDMAL / Rodan