#27 "Automatic upload" discussions

Open
opened 5 years ago by Zykino · 10 comments
Zykino commented 5 years ago

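Since the next unchecked feature is about automatic upload (see below), I think we can start gathering information on what we need and how to do it. (By the way, the item after it, about desktop use, can be checked, can't it?)

> * [ ] Record and forget: put the video in a directory, and the script uploads it for you
> * [x] Usable on Desktop (Linux and/or Windows and/or MacOS)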

After a recording or stream, I write the desired publication date and title in a file so they are ready when I upload the video. This helps me type the right command line quickly, but it would be a welcome help if Prismedia could read that file directly.

To do this we need to think about quite a lot of things, and try to find out how people use Prismedia and how this feature may or may not fit into their workflow.

Format of the file

There are a lot of standardized file formats. A good one should be easily and naturally understood by most people (and easy to parse/modify from Python):

| Format | Description | PRO | CON | Known Parser (quick search) | Yes/No |
|:-:|:-:|:-:|:-:|:-:|:-:|
| INI | I think it is the format used for the NFO | Same format => easier for the users, only one parser | | RawConfigParser | |
| TOML | Almost the same as the INI format, with an actual spec: https://github.com/toml-lang/toml | Easy for the user, strongly specified | Maybe replace the NFO format with this one since they are close to each other. | toml | |
| JSON | JavaScript Object Notation | Well known by programmers, good hierarchy | Not that easy for average users, may be a bit too strict to use easily | Native support `import json` | |
| YAML | A formal format with hierarchy | Like JSON but more user friendly, easy to write by hand | | PyYAML, a third-party package (`import yaml`) | |
| CSV | An easy format which can be created with a spreadsheet (Excel/LibreOffice Calc) | Usable from source or spreadsheet | No hierarchy, no Unicode reading by default (in Python 2) | Native support `import csv` | |

Other? I don't know all the formats, so if you know any that may be interesting please share them with us.

Info needed in the file

We need to define all the information needed in this file and how to structure it hierarchically. We also need to decide whether some info should be duplicated from the NFO.

Global info to a channel

  • OPTIONAL Platform (already in the NFO)

Global info to a playlist

  • NFO
  • OPTIONAL Playlist name (already in the NFO)

Local info to a video

  • Title
  • OPTIONAL Publish date
  • OPTIONAL Whether it is already uploaded on Peertube (Prismedia will update this on successful upload; if absent, Prismedia considers the video not uploaded)
  • OPTIONAL Whether it is already uploaded on Youtube (Prismedia will update this on successful upload; if absent, Prismedia considers the video not uploaded)

Daemon/Service or task called regularly (cron job)

We can create a daemon that watches for a modified file or a new video.
=> We need to make sure the video is fully synced to disk before uploading. It may still be syncing to a server, or still being recorded, when we detect the change.
=> Linux uses daemons, started by systemd/initd/... Windows uses services. I don't know about Mac. => This starts to look like a lot of work/special tuning for each platform.

The other choice is to have an "auto" option taking the file discussed above as a parameter:
./prismedia --auto="autoFile.type"
The user can then either launch the command manually when ready, or create a cron job (or use the Task Scheduler on Windows) so the upload is done in the background.
=> If we tell people to set up cron/Task Scheduler we have the same problem as with the daemon: we need to check that the files are fully synced to disk.

Final notes

I hope everything is clear (it's late right now and I took a long time writing this as a background task). Sadly I was only able to create an issue for this and not a Milestone.

Now it's your time to shine: bring your ideas, needs and requests!

Poster
Owner

Thanks for the description!

I can confirm that the NFO files (and the peertube secret too) use the INI format. I think it's the easiest for users compared to YAML or JSON.
I think it's better to keep it unless we have special needs, since the code to handle it is already there :-)

The hardest part would be detecting whether the file is finished. On Linux perhaps we could play with the "created" and "last updated" file attributes. Need to do some tests here 😂
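
One way to experiment with the "last updated" idea is to treat a video as finished only once its modification time has been quiet for a while. This is a minimal sketch of that heuristic, not something Prismedia implements; the helper name `is_stable` and the 30-second threshold are assumptions made for illustration.

```python
import os
import time


def is_stable(path, quiet_seconds=30.0, now=None):
    """Heuristic: True if the file was last modified at least quiet_seconds ago.

    Hypothetical helper; the name and the default threshold are illustrative.
    `now` can be injected (seconds since the epoch) to make testing easier.
    """
    modified = os.stat(path).st_mtime
    current = time.time() if now is None else now
    return (current - modified) >= quiet_seconds
```

This only looks at the modification time, so a tool that preserves timestamps while copying could fool it. Comparing the file size across two runs would be a possible extra check.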

Zykino commented 3 years ago
Poster

Just thought about it:
Now with the multiple NFOs (#11), this feature "only" requires knowing which videos are already uploaded and which are not.
We could have a file which only lists the videos by filename (with a path relative to this file), along with their URLs. There should also be a way to record errors there.
Launching Prismedia with the argument --auto=<path-to-file> or similar would read this file, and Prismedia would attempt an upload for every file that has no URL yet.

It would be wonderful if we could list other auto files instead of only videos (so we could create some hierarchy).

For a first version, I think we should assume that the files are all present. We also need #36.

A file in NFO/TOML, used with python -m prismedia --auto="autoUpload.cfg", would look like:

[autoupload]
subfolders = [
  "/other/folder/from/root/autoUpload.cfg",
  "folder/relative/autoUpload.cfg"
]

[videos]
Episode1.mp4 = { peertube = "<peertube-url>", youtube = "<youtube-url>" }
Episode2-1.mp4 = { upload-time = "2017-12-12T17:42:42", peertube = "<peertube-url>", peertube-publish = "2017-12-12T17:42:42", youtube = "<youtube-url>", youtube-publish = "2017-12-12T17:42:42" } # More complete version maybe ?
Episode2-2.mp4 = { upload-time = "2017-12-12T17:42:42", peertube = "<peertube-url>", peertube-publish = "2017-12-12T17:42:42", error = "youtube: 500 Internal server error" }
Episode3.mp4 = { error = "Time should be in future" } # (or whatever the error message is)
Episode4.mp4 # User set this file to be uploaded next time the autouploader runs
subSerie/Episode5-1.mp4 # Files do not need to be on the same folder
LecygneNoir added the enhancement label 3 years ago
LecygneNoir added the Todo label 3 years ago
Poster
Owner

The idea of using an "auto" file is great, I think! It avoids the uncertainty of "has the file finished copying or not?" with an easy implementation 👍

This is something we could definitely achieve.

I was also thinking about some "prismediad" to daemonize Prismedia and check files regularly, but while I am pretty confident I can achieve this on Linux, I have absolutely no idea how to do it on Windows and Mac. So at first I'll focus on having a solid "auto file" structure, launched manually with the --auto option :-)

With this, it should be easy to schedule Prismedia runs from time to time (cron on Linux and Mac, scheduled task on Windows?) and simulate a daemon!

Poster
Owner

Now that #36 is ready to release, I think having #29 would help a lot to construct the file, and have a better logging system.

With these two issues done, all prerequisites for autoupload would be ok!

Zykino commented 3 years ago
Poster

So I created another project (for ease of use/quick tests) that you can find here: https://git.lecygnenoir.info/Zykino/prismedia-autoupload. As of writing this comment I am on commit f16e11fa58.

Implementation details made me change the file structure a bit from my previous comment. A user will create an autoupload file with content as follows:

# This field can be omitted if no other autoupload subfile is needed
autoupload = ["another/autoconfig/file.toml"] # list of strings, one for each sub autoupload file

[videos]
Episode1 = {} # No ".mp4" suffix needed (it is implied); the empty braces are required (with or without a space between them)

The unit tests use Python's unittest because it is in the standard library (if Prismedia needs another framework we will be able to change).
There is one class, AutoUpload, and 3 member functions to interact with it:
def nextVideo(self, platform, recursive=True): returns the next video to upload for a given platform.
def success(self, platform, url, publishDate): marks the last returned video as succeeded. It needs the URL of the video and the publish date to write them in the autoupload file.
def error(self, platform, errorString): marks the last returned video as failed. It needs a human-readable error string so the user can know which error occurred.

Within the same "run", platform should be the same when calling nextVideo and when calling success or error. => I will change that; I did not think about it until writing this. So success and error won't have this parameter in the next version.
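
To illustrate how the three calls fit together once `platform` is dropped from `success` and `error`, here is a rough sketch. It is not the code from the prismedia-autoupload repository: the class name `AutoUploadSketch`, the in-memory dict standing in for the `[videos]` table, the `<platform>-publish` key and the choice to skip entries that already carry an error are all assumptions made for illustration.

```python
class AutoUploadSketch:
    """Illustrative stand-in for the AutoUpload class described above.

    Works on an in-memory dict shaped like the [videos] table; it does not
    read or write any file.
    """

    def __init__(self, videos):
        self.videos = videos   # e.g. {"Episode1": {}}
        self._current = None   # (video name, platform) handed out last

    def nextVideo(self, platform):
        """Return the first video with no URL and no error for this platform."""
        for name, entry in self.videos.items():
            if platform not in entry and "error" not in entry:
                self._current = (name, platform)
                return name
        self._current = None
        return None

    def success(self, url, publishDate):
        """Record the URL and publish date for the last returned video."""
        name, platform = self._require_current()
        self.videos[name][platform] = url
        self.videos[name][platform + "-publish"] = publishDate

    def error(self, errorString):
        """Record a readable error for the last returned video."""
        name, platform = self._require_current()
        self.videos[name]["error"] = platform + ": " + errorString

    def _require_current(self):
        # The platform is remembered from nextVideo(), so callers cannot
        # pass a different one by mistake.
        if self._current is None:
            raise RuntimeError("call nextVideo() first")
        current, self._current = self._current, None
        return current
```

Remembering the platform inside the object is one way to make the mismatch described above impossible, at the cost of the calls having to happen in order.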

Well, writing this in plain text made me think of a lot of errors/things to test next, but at least you have an overview of what is coming and can give me feedback 😅.

  • [x] do I write the correct file on success/error? (apparently I always write the original file)
  • [x] remove the need for the platform parameter in the success/error functions
  • [x] make the success/error functions fail if nextVideo was not called (or returned no video)
  • [x] BONUS: have 2 classes so the library user cannot call success/error on a video that was not found. It will make the lib a little heavier but follows Rust's convention of not letting an erroneous function be called.
  • [ ] read the generated files and check that they contain what we want (this will make it possible to check for regressions in the files generated for users)
Poster
Owner

Unfortunately my object-oriented development skills are pretty poor, so reading and understanding the code is difficult for me, sorry ^^"

From what I understand from your configuration file example and the code, when autoupload.py is launched, it loads the file and loops over the different [videos], skipping files which already have URLs?

How it deals with errors (it sets the error string in the file, then continues) is not entirely clear. To retry, do we just need to remove the error string?

Or is it retried at each loop?
(line 100 seems to say so)

Also in your file example, you sometimes have

{ peertube = "<peertube-url>", youtube = "<youtube-url>" }

And other times

Episode2-1.mp4 = { upload-time = "2017-12-12T17:42:42", peertube = "<peertube-url>", peertube-publish = "2017-12-12T17:42:42", youtube = "<youtube-url>", youtube-publish = "2017-12-12T17:42:42" }

I do not see in the code how the first case may appear; I guess it's now obsolete?

I was recently thinking about adding a line directly in the NFO, something like

uploaded = true/false

The goal is to loop over all mp4 files in a directory and upload only if uploaded = false, and also to give the user an easy way to be sure whether the video has been uploaded or not

Do you think it may help you?
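
For discussion, the `uploaded` idea could be sketched with `configparser`, which already handles the INI format. The sidecar naming (`<name>.nfo` next to each video), the `[video]` section and the function name are assumptions of this sketch, not Prismedia's actual NFO lookup rules.

```python
import configparser
from pathlib import Path


def videos_to_upload(directory):
    """Yield .mp4 files whose sidecar NFO does not say uploaded = true.

    Assumptions of this sketch: the NFO sits next to the video as
    <name>.nfo and the flag lives in a [video] section.
    """
    for video in sorted(Path(directory).glob("*.mp4")):
        parser = configparser.ConfigParser()
        parser.read(video.with_suffix(".nfo"))  # a missing file is ignored
        # A missing file, section or option counts as "not uploaded".
        if not parser.getboolean("video", "uploaded", fallback=False):
            yield video
```

A video without any NFO is treated as not uploaded here, which matches the "if not present, consider it not uploaded" rule from the first post.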

Zykino commented 3 years ago
Poster

> From what I understand from your configuration file example and the code, when autoupload.py is launched, it loads the file and loops over the different [videos], skipping files which already have URLs?

Yes.

> How it deals with errors (it sets the error string in the file, then continues) is not entirely clear. To retry, do we just need to remove the error string?
> Or is it retried at each loop? (line 100 seems to say so)

In this version of the code, an error is expected to be resolved before a retry. We could add this as a condition in the getter for the next video at line 35.

> Also, in your file example, you sometimes have
> { peertube = "<peertube-url>", youtube = "<youtube-url>" }
> [...]
> I do not see in the code how the first case may appear; I guess it's now obsolete?

I do not see this construct anymore. I know I tried multiple formats and may have failed to update some older/simpler structures.

As for your proposal, it could be OK if you have one NFO for each video, but not if you want a common NFO. And which one should be used if you have multiple NFOs?
It made me want to explain how I thought about this feature:
To me, a user will have multiple video files that they want to upload. The videos may be organised into folders that the video maker wants to keep. The autofiles can be placed wherever the user wants, but a main one is useful to refer to the others. A video maker could choose to use one per playlist, so they can force the upload order when many videos are waiting to be uploaded.
I also wanted the autofiles to keep the upload/error date in case someone wants to compile statistics about their upload habits. I also thought about adding the publish time, but I wasn't sure about it at the time.

Zykino commented 3 years ago
Poster

While I am thinking about it (but this can be done after the merge, on a different ticket), we can upgrade the nextVideo method by taking into account the publish date of the video.

For this, the part of the code that reads the configuration file needs to be in a lib that the automatic upload can depend on. Then nextVideo would load the configuration of all the files not yet uploaded, choose the closest date first (or no date, or a date already past), and say to upload that file.
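
The selection rule could look roughly like the sketch below. The function name `pick_next` and the input shape (a dict mapping a video name to a `datetime` or `None`) are hypothetical; how the dates would actually be read from the configuration is left out.

```python
from datetime import datetime


def pick_next(publish_dates, now):
    """Pick the not-yet-uploaded video that should go out first.

    publish_dates maps a video name to a datetime or None (hypothetical
    input shape). Videos with no date, or a date already in the past, are
    treated as due immediately; otherwise the earliest date wins.
    """
    if not publish_dates:
        return None

    def sort_key(item):
        name, date = item
        due = now if date is None or date <= now else date
        return (due, name)  # the name breaks ties deterministically

    return min(publish_dates.items(), key=sort_key)[0]


# Example: the video scheduled soonest is picked first.
print(pick_next({"late": datetime(2020, 3, 1), "soon": datetime(2020, 1, 15)},
                now=datetime(2020, 1, 10)))
```

Treating undated and overdue videos as "due now" is one reading of "or no date, or a date already past"; other orderings are possible.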

But as said in the intro, I think we can first publish a "dumb" working version and make it smarter later.

Poster
Owner

Mmmh, do you think we should publish the feature in the core code of Prismedia, perhaps as a second binary, or do you prefer to have it as a separate project?

Both cases have pros and cons, but as you have done the test work and written the code itself outside of Prismedia, perhaps it would be better to publish it as a kind of "plugin" for Prismedia; we could link to it from the core project and add documentation in the README 🤔

If you prefer to have it inside Prismedia I am totally open. Do not hesitate to open a Merge Request, and we can discuss the best way to integrate it, either as an option for Prismedia, a side binary, perhaps even a daemon!

Regarding the new features I agree with you: it's better to write the code first, then add features once we have something that works.

If you plan to keep it separate from Prismedia, perhaps you could open tickets directly in your project on Gitea, so you can track all your ideas! (and not forget them. Memory is the worst way to work on code 😂)

Zykino commented 3 years ago
Poster

As discussed in PM, starting to have plugins sounds like the best option in the long run to avoid bloating Prismedia. But since neither of us has experience with them, it may take a bit of time.
