Script not running in the submission

Good morning!
When we tested our Docker container submission locally, the container ran the Python script as its entry point just fine, but when we submitted the image via TIRA, it ran into the following issue while running the script:
error_tira
What could be the cause of this issue?
Our Dockerfile pulls from the following image:

FROM python:3.7.12-slim-buster

and uses the following entry point

ENTRYPOINT [ "python", "/run_task_1.py" ]

Thank you for contacting me, I just wrote a mail to you in parallel :slight_smile:

Have you tried the tip with the shebang from my parallel mail?
It is very interesting that the script works when run from the entry point; maybe that is because of a different path (I will look into this).

Best Regards,

Maik

Ah, yes, indeed, now that I see your entrypoint, it is clear to me why the entrypoint works but the submitted software does not.

In your entrypoint, you execute the program python with /run_task_1.py as its argument (so python runs your script).

But the command that you specified in your submission was just /run_task_1.py, so the system tries to run it as a standalone executable. You can resolve this either by changing the command that is executed, e.g., to python /run_task_1.py, or (and I think this is the better approach) by adding a shebang to /run_task_1.py and making the file executable.
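A minimal sketch of the second approach, assuming the script is copied to the image root and an /usr/bin/env-resolvable Python interpreter exists in the image:

```dockerfile
FROM python:3.7.12-slim-buster

# Copy the script and make it directly executable, so the plain command
# /run_task_1.py works without an explicit "python" prefix.
COPY run_task_1.py /run_task_1.py
RUN chmod +x /run_task_1.py

# Note: the first line of run_task_1.py must then be a shebang, e.g.:
#   #!/usr/bin/env python3
```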

Does this help you?

Best Regards,
Maik

Ah, sorry, the /run_task_1.py command is a typo; it is meant to be /run_task_2.py.
Concerning your email, we can confirm that we did not initially add a shebang, so we will try it out. Any specific tips on finding the Python executable within the Docker image?

On the second point, I assume we should try:
ENTRYPOINT [ "python", "python /run_task_1.py" ]
in that format?

On a third note, in the baselines examples, the code tests out the docker containers locally via the tira.run command from the tira library. Are there any instructions on where to get and install this tira python library? A quick search yields no useful results.

I think the best way to figure out the Python executable in your Docker image would be to run which python inside the container.

There is not yet a library for TIRA; the baseline used this script, which you can also use to test your submission locally.

I also saw that you are now one step further, and your software now failed because it could not parse the shebang :slight_smile:
So we are almost there :slight_smile:

I think this is now an encoding problem: “/usr/local/lib/python3.7^M” is not a valid Python executable because of the “^M”. The “^M” is the Windows line break, but since the Docker image runs Linux, the system does not strip the “^M” when parsing the shebang. Can you convert the line breaks to Linux ones? I think this can be configured in most editors.
I also know that some tools (I think Git) can automatically convert line breaks to the Linux format.
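If the editor configuration is inconvenient, the conversion can also be done with a few lines of Python. A minimal sketch (the file name demo_script.py is only a stand-in; in the real submission this would be run on run_task_2.py before building the image):

```python
from pathlib import Path

def crlf_to_lf(path: Path) -> None:
    """Rewrite Windows line endings (CRLF) as Unix ones (LF) in place."""
    data = path.read_bytes()
    path.write_bytes(data.replace(b"\r\n", b"\n"))

# Demo on a small stand-in file with Windows line endings:
demo = Path("demo_script.py")
demo.write_bytes(b"#!/usr/bin/env python3\r\nprint('ok')\r\n")
crlf_to_lf(demo)
```

Working on bytes (rather than text mode) avoids Python's own newline translation interfering with the conversion.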

Do you have an idea on how to tackle this?

Best Regards,
Maik

Thank you, we managed to change the line breaks of the run_task_2.py file from CRLF to LF with our editor configuration.
The current error we are stuck on involves a permission issue, which we do not yet know how to solve:

/scripts-2761-18357/step_script: /run_task_2.py: /usr/local/lib/python3.7: bad interpreter: Permission denied

Alright, thanks for letting me know. I will look into this and report back as soon as I know more!

Ok, I googled around a bit, and it turns out the preferred way to specify the shebang for Python is #!/usr/bin/env python3. (I saw this here: python - Purpose of #!/usr/bin/python3 shebang - Stack Overflow)

I just tested this inside your container and it worked, so could you please make an additional submission with the shebang #!/usr/bin/env python3? Then it should work (I have not tested your script, only that this shebang works in your pod).

Thanks, the provided shebang #!/usr/bin/env python3 worked just fine.
The current problem is that the script in the uploaded image executes, but cannot find the local model:

01/24/2023 12:38:38 - INFO - farm.modeling.language_model - Could not find saved_models/qa-model-task2 locally.
01/24/2023 12:38:38 - INFO - farm.modeling.language_model - Looking on Transformers Model Hub (in local cache and online)…

but when we run a container locally with this image, the script inside the container runs just fine:

2023-01-24 13:44:32 01/24/2023 12:44:32 - INFO - farm.modeling.language_model - Model found locally at saved_models/qa-model-task2
2023-01-24 13:44:35 01/24/2023 12:44:35 - INFO - farm.modeling.language_model - Loaded saved_models/qa-model-task2

What could be the cause of the local container being able to find the model within the image, while the uploaded image fails to do so?

Suggestions would be appreciated! Our current Dockerfile just uses the following copy command to copy the files of the repository into the image:

COPY . .

Nice that the shebang worked.

I think the problem that you now face can be resolved by specifying the absolute path where the models are stored. E.g., the path saved_models/qa-model-task2 is a relative path, so it can become invalid when the environment, in particular the working directory, changes (which is what happened here).

I looked into your image, and it seems like the saved_models directory is directly in the root / of the system. I.e., it should work if you switch the path from saved_models/qa-model-task2 to /saved_models/qa-model-task2
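To illustrate the difference: a relative path is resolved against the current working directory, which can differ between a local docker run and the TIRA execution environment, while an absolute path identifies the same location regardless of where the process was started. A small sketch, assuming (as above) that COPY . . placed the repository contents at the image root /:

```python
from pathlib import Path

# Resolved against the current working directory; breaks if the
# environment starts the script from somewhere else:
relative = Path("saved_models/qa-model-task2")

# Anchored at the file system root; independent of the working directory:
absolute = Path("/saved_models/qa-model-task2")

print(relative.is_absolute())  # False
print(absolute.is_absolute())  # True
```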

Can you please try it again with this adjusted path?

Thanks, with these instructions the Python script seems to be running without errors so far!

Awesome, I am glad to hear this!