SFTP options


#1

Hello,

I’m designing the API for SFTP connections - I would appreciate feedback from those of you who are interested in using SFTP.

I think I would like to settle on a few operations:

  • get - remote to local, may point to a file or directory, can be recursive
  • put - local to remote, options as above
  • ls - remote file or directory, equivalent to “ls -la” on a remote host

They would be used along the lines of:

def handle(self):
    sftp = self.out.sftp.get('My Connection')
    out = sftp.get('/path/to/a/remote/file/or/directory')
    
    out.data   # Actual data downloaded
    out.stdout # Any information written to stdout during the task
    out.stderr # Ditto but for stderr

There would probably be nothing like creating local symlinks or changing local permissions. The sftp command line tool offers these, but so does Python, for instance via os.chmod.

All or most of the SSH options specified in “man sftp” would be supported, e.g. IdentityFile, Compression, ServerAliveInterval or bandwidth limits.
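As a rough illustration of how such options could map onto the command line tool, here is a minimal sketch that only builds an argument list and executes nothing. The function name and its parameters are hypothetical and not part of any Zato API; the -o, -l and -b flags are the ones documented in “man sftp” for the OpenSSH client.

```python
def build_sftp_command(host, ssh_options=None, bandwidth_limit_kbit=None, batch_file=None):
    """Return an argv list for the OpenSSH sftp client (nothing is executed here)."""
    command = ['sftp']

    # Each SSH option is passed in the "-o Name=value" form.
    # Sorting keeps the resulting command line deterministic.
    for name, value in sorted((ssh_options or {}).items()):
        command.extend(['-o', '{}={}'.format(name, value)])

    # -l limits the used bandwidth, expressed in Kbit/s.
    if bandwidth_limit_kbit is not None:
        command.extend(['-l', str(bandwidth_limit_kbit)])

    # -b reads commands from a batch file instead of an interactive prompt.
    if batch_file is not None:
        command.extend(['-b', batch_file])

    command.append(host)
    return command


command = build_sftp_command(
    'user@example.com',
    ssh_options={'Compression': 'yes', 'ServerAliveInterval': 30},
    bandwidth_limit_kbit=512,
)
print(' '.join(command))
```

Keeping the options in a plain dict would let a connection definition carry any option from “man ssh_config” without the API having to know each one by name.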

Does that sound like a useful subset of the functionality or is there anything else in particular that you would like to see added?

Thanks.


#2

Hello, @dsuch!

Did you exclude rm on purpose? Also, what about mkdir and rmdir?

Also, how will one work out whether an operation went okay (e.g. whether a put uploaded the file successfully)?


#3

I just forgot about it. Yes, remote rm, mkdir and rmdir make sense. I just do not want to implement this kind of functionality for local operations, at least for now. Perhaps with time this can be added for consistency with remote commands.

Sure, there will be some kind of an out.is_ok or out.is_success flag.
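To make the idea of such a flag concrete, here is a minimal sketch of what a result object could look like. The class name and the exit_code attribute are assumptions made for illustration only; the data, stdout, stderr and is_ok names follow what was mentioned in this thread.

```python
class SFTPResult(object):
    """Hypothetical container for the outcome of a single SFTP operation."""

    def __init__(self, exit_code, stdout='', stderr='', data=None):
        self.exit_code = exit_code  # Assumed: exit status of the underlying command
        self.stdout = stdout        # Anything written to stdout during the task
        self.stderr = stderr        # Ditto but for stderr
        self.data = data            # Actual data downloaded, if any

    @property
    def is_ok(self):
        # By convention, a zero exit status means success.
        return self.exit_code == 0


out = SFTPResult(1, stderr='remote open("/upload/a.csv"): Permission denied')

if not out.is_ok:
    print('Upload failed: {}'.format(out.stderr))
```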


#4

Hello,

First, thanks for working on integrating SFTP into Zato. A native integration can save people some of the headaches I had when first dealing with the framework.

When I first wanted to replace the FTP system with an SFTP one, based on a customer request, I found out how many parts of the code I would need to replace. I later introduced a class interface with the common methods needed and implemented it with the options at my disposal, so I could more easily swap them wherever needed. Maybe this is a good policy to adopt in Zato as well. The more transparent an outgoing file-based connection is, the easier it is to change it in a service without changing code directly. I am not sure how hard it would be to satisfy the same methods across different file systems, but so far it has been a good approach for FTP and SFTP. I can share the interface signatures tomorrow, when I have the code in front of me.

Right now, I have one implementation of the interface for the regular FTP supported by Zato (which is based on the ancient fs 0.4.0) and another for ssh2-python 0.17.0, which uses an embedded libssh2 library and allowed me to use a non-blocking mode for SFTP calls.

I’m not sure if the next section will be useful for you or not, but since most of my pains inside Zato were (and some still are) related to SFTP issues, here are some insights from my discoveries during this exploration:

  • fs 0.4.0 is ancient and I never understood why the library was never updated with new Zato versions. At first I thought it was a coroutine problem in later versions, but even today I have several examples of code using fs calls which block the workers (my service never ends properly in such cases);
  • paramiko is unusable, since it depends on libraries which conflict with some embedded libraries in Zato (as of versions 2 and 3), and it also does not support non-blocking mode;
  • fs 0.4.0 supports SFTP using paramiko, so it has all the same problems for SFTP support;
  • Newer versions of fs dropped the SFTP support altogether;
  • asyncssh is Python 3 only;
  • Fabric is more command line focused and also powered by paramiko (Ansible has some of the same problems);
  • ssh2-python depends on libssh2 (which implements SFTP calls in C) and has some small problems: for instance, on an open remote file some methods have different signatures compared to the standard library. One example is the seek method, which lacks the second parameter that specifies where to anchor the reference point. This breaks some integrations (for me it was opening a ZipFile remotely, which did not work and required me to download the file locally first). Even so, it was the closest to perfect I could get on Python 2, with non-blocking mode (it required using gevent sockets and select functions directly to avoid blocking when waiting for network calls);
  • parallel-ssh can use either paramiko or libssh2 underneath, but it has an explicit dependency on paramiko, even if you want to use libssh2 only. In future versions the author will make the paramiko dependency optional (no ETA on this, though).
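Regarding the seek issue mentioned in the list above, one possible workaround is a thin adapter that keeps track of the position itself and translates the standard whence argument into absolute offsets. The sketch below is Python 3 and runs against a stand-in handle: AbsoluteSeekHandle and its seek(offset) / read(size) methods are hypothetical and do not mirror ssh2-python's real handle API, which would need its own shim. It also assumes the file size is known up front, e.g. from a stat call.

```python
import io


class AbsoluteSeekHandle(object):
    """Stand-in for a remote handle whose seek only accepts an absolute offset."""

    def __init__(self, payload):
        self._buffer = io.BytesIO(payload)

    def seek(self, offset):
        self._buffer.seek(offset)

    def read(self, size):
        return self._buffer.read(size)


class WhenceAwareFile(io.RawIOBase):
    """Adds standard seek(offset, whence) semantics on top of such a handle."""

    def __init__(self, handle, size):
        super(WhenceAwareFile, self).__init__()
        self._handle = handle
        self._size = size  # Assumed to be known in advance
        self._pos = 0

    def readable(self):
        return True

    def seekable(self):
        return True

    def tell(self):
        return self._pos

    def seek(self, offset, whence=io.SEEK_SET):
        # Translate every anchor into an absolute position.
        if whence == io.SEEK_SET:
            target = offset
        elif whence == io.SEEK_CUR:
            target = self._pos + offset
        elif whence == io.SEEK_END:
            target = self._size + offset
        else:
            raise ValueError('Unsupported whence: {!r}'.format(whence))
        if target < 0:
            raise OSError('Negative seek position: {}'.format(target))
        self._handle.seek(target)
        self._pos = target
        return target

    def readinto(self, buffer):
        # io.RawIOBase builds read() on top of readinto().
        chunk = self._handle.read(len(buffer))
        count = len(chunk)
        buffer[:count] = chunk
        self._pos += count
        return count


payload = b'0123456789'
remote_file = WhenceAwareFile(AbsoluteSeekHandle(payload), len(payload))
remote_file.seek(-3, io.SEEK_END)
print(remote_file.read(3))
```

Libraries that expect a regular seekable file object, such as zipfile, only rely on this standard interface, so an adapter along these lines may avoid downloading the whole file first, at the cost of extra round trips for every seek and read.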

When Zato becomes a Python 3 framework, some of these problems may be alleviated, but I just wanted to share part of my knowledge, in case you need to pass through the same decision tree on your side.

Will be back tomorrow to share my common interface, which may shed some light on my usage of SFTP.

[]'s


#5

Thanks for the write-up @rtrind - I evaluated all the options and the only feasible way looks to be running the sftp command in a subprocess, which is fine.
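For readers wondering what the subprocess approach could look like, here is a minimal Python 3 sketch, not the actual implementation. It assumes the OpenSSH sftp client is on PATH and that authentication is non-interactive (e.g. key-based), because batch mode cannot prompt for a password. The function name and the injectable runner parameter are hypothetical; “-b -” makes sftp read its batch commands from standard input.

```python
import subprocess


def run_sftp_batch(host, commands, runner=subprocess.run):
    """Run sftp commands in batch mode and return (exit_code, stdout, stderr).

    The runner parameter exists only so that the function can be exercised
    without a real server; by default it is subprocess.run.
    """
    # One command per line, exactly as in an sftp batch file.
    batch = '\n'.join(commands) + '\n'

    completed = runner(
        ['sftp', '-b', '-', host],
        input=batch,
        capture_output=True,
        text=True,
    )
    return completed.returncode, completed.stdout, completed.stderr


if __name__ == '__main__':
    # Hypothetical host and paths, shown for illustration only.
    exit_code, stdout, stderr = run_sftp_batch(
        'user@example.com',
        ['ls -la /incoming', 'get /incoming/report.csv /tmp/report.csv'],
    )
    print(exit_code, stderr)
```

In batch mode sftp aborts on the first failing command and exits with a non-zero status, which gives the caller a simple success signal.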

As for the Python 3 compatibility - I just want to confirm and emphasize that adding support for Python 3 certainly will not make Zato a Python 3-only platform. Python 2.7 support will still be available.


#6

Here are my method signatures for the common interface. They should be self-explanatory. The two folder content methods return the contents of a folder: one returns only the file names, the other a dict of file properties, such as size, mtime and so on. Move also functions as rename.

def close(self):
def copy_file(self, _from_path, _to_path, _overwrite=False):
def create_remote_file_from_content(self, _new_file_name, _data):
def download_file(self, _remote_path, _local_path):
def file_exists(self, _path):
def file_info(self, _path):
def folder_content_name_list(self, _path, _wildcard=None, _files_only=True):
def folder_content_info_list(self, _path, _wildcard=None, _files_only=True, _sorted_by_mtime=False):
def move(self, _old_path, _new_path):
def open(self, _filename):
def remove(self, _filename):
def upload_file(self, _local_path, _remote_path):

All methods can raise custom exceptions to help deal with unexpected issues or commands which do not complete properly.
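To show how such a common interface lets service code stay unaware of the transport, here is a minimal Python 3 sketch covering only four of the signatures above. FileTransport, FileTransportError, InMemoryTransport and archive_reports are hypothetical names invented for this example; a real FTP or SFTP class would implement the same methods.

```python
import abc
import fnmatch


class FileTransportError(Exception):
    """Hypothetical base class for the custom exceptions."""


class FileTransport(abc.ABC):
    """Subset of the common interface; FTP and SFTP classes would both implement it."""

    @abc.abstractmethod
    def file_exists(self, _path): ...

    @abc.abstractmethod
    def folder_content_name_list(self, _path, _wildcard=None, _files_only=True): ...

    @abc.abstractmethod
    def move(self, _old_path, _new_path): ...

    @abc.abstractmethod
    def remove(self, _filename): ...


class InMemoryTransport(FileTransport):
    """Toy implementation backed by a dict of path -> content."""

    def __init__(self, files=None):
        self._files = dict(files or {})

    def file_exists(self, _path):
        return _path in self._files

    def folder_content_name_list(self, _path, _wildcard=None, _files_only=True):
        # Only files exist in this toy store, so _files_only has no effect here.
        prefix = _path.rstrip('/') + '/'
        names = [path[len(prefix):] for path in self._files if path.startswith(prefix)]
        names = [name for name in names if '/' not in name]  # Direct children only
        if _wildcard:
            names = fnmatch.filter(names, _wildcard)
        return sorted(names)

    def move(self, _old_path, _new_path):
        if _old_path not in self._files:
            raise FileTransportError('No such file: {}'.format(_old_path))
        self._files[_new_path] = self._files.pop(_old_path)

    def remove(self, _filename):
        if _filename not in self._files:
            raise FileTransportError('No such file: {}'.format(_filename))
        del self._files[_filename]


def archive_reports(transport, folder):
    """Service-style code that depends on the interface, not on FTP or SFTP."""
    moved = []
    for name in transport.folder_content_name_list(folder, _wildcard='*.csv'):
        transport.move(folder + '/' + name, folder + '/archive/' + name)
        moved.append(name)
    return moved


transport = InMemoryTransport({'/in/a.csv': b'1', '/in/b.txt': b'2'})
print(archive_reports(transport, '/in'))
```

Swapping FTP for SFTP then means passing a different FileTransport instance to archive_reports, with no change to the function itself.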

Hope it’s useful!


#7

Nice to haves:

  • Ability to set regex file filters
  • Ability to set permissions on destination files
  • Ability to preserve modification times when putting/getting
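As a sketch of how these three wishes could translate into sftp batch commands, the function below filters file names with a regular expression and emits put and chmod lines. The function names, parameters and paths are hypothetical. The -p flag of put and the chmod command are documented in “man sftp”; the quoting helper is deliberately simplistic and assumes names without glob characters.

```python
import re


def quote(path):
    """Wrap a path in double quotes for an sftp batch line (simplistic)."""
    return '"' + path.replace('\\', '\\\\').replace('"', '\\"') + '"'


def build_put_commands(local_dir, names, pattern, remote_dir, mode=None, preserve=True):
    """Return sftp batch lines uploading every name that matches the regex."""
    regex = re.compile(pattern)
    # put -p asks sftp to copy permissions and access times along with the file.
    put = 'put -p' if preserve else 'put'

    commands = []
    for name in names:
        if not regex.search(name):
            continue
        remote_path = remote_dir + '/' + name
        commands.append('{} {} {}'.format(put, quote(local_dir + '/' + name), quote(remote_path)))
        # An explicit mode, e.g. '640', is applied to the destination file.
        if mode is not None:
            commands.append('chmod {} {}'.format(mode, quote(remote_path)))
    return commands


for line in build_put_commands(
    '/out', ['a.csv', 'b.txt', 'report_2019.csv'], r'^report_\d{4}\.csv$', '/upload', mode='640'
):
    print(line)
```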


#8

Thanks everyone - these are very good suggestions.

I think all of it can be implemented, though I would like to keep the names of operations as they appear in the sftp command line utility. That aside, this is great feedback, thank you.